While moving a customer from On-Premises to Oracle Cloud Infrastructure, we recently conducted some performance testing before the final cutover. For the E-Business Suite database server, a physical box with 16 CPU cores was migrated to a VMStandard E3.Flex VM with 12 OCPU. From a CPU performance perspective, the new VM was much faster than the old box. Unfortunately, this was not the case on the I/O performance side.
Let's take a look at how we measured the I/O performance and what the problem was. Then, in 2 more blog posts, I will look at two different approaches to solving this problem and their corresponding performance.
Oracle gives a fairly detailed description of the I/O performance of block volumes (and that's all we are considering at the moment, since that’s the default location of the database) in OCI.
Basically there are 3 factors to consider for both throughput and IOPS:
- Network-Speed of the Compute Instance (for E3 this is 1GBit/OCPU)
- Performance Class can either be "Lower Cost" (1TB costs 26$ per month), "Balanced" (43$/TB) or "Performance" (60$/TB). The more expensive option you choose, the better the performance
- Size of the volume (up to a specific cut-off size)
The exact relationships / numbers are listed in the following table:
For the "Balanced" Block Volume type this means (depending on the volume size):
E-Business Suite Cloud Manager provisioned Database (on Compute)
When creating a compute instance to run the Oracle database (12c or 19c) for an E-Business Suite instance (12.1 and 12.2), Oracle creates only one large block volume containing both the Oracle home database and all of the data from archive log files. For one customer this was a volume of 4.5 TB. Unfortunately, this means that "only" a maximum throughput of 480MByte/s at 25,000 IOPS is available. Of course, this is only true if the compute instance has enough network bandwidth. In the case of E3.Flex instances, this means that at least 4 OCPUs are required (better 6 to have some buffer).
There are many ways to measure the I/O Performance on Linux / for Oracle Databases. A convenient way is to let the database measure the performance on its own: This has the advantage that it also works for ASM based storage, as we'll see in the next episode.
The following program performs a short performance test:
SET SERVEROUTPUT ON
DBMS_RESOURCE_MANAGER.calibrate_io (num_physical_disks => 7,
max_latency => 20,
max_iops => l_iops,
max_mbps => l_mbps,
actual_latency => l_latency);
DBMS_OUTPUT.put_line('Max IOPS = ' || l_iops);
DBMS_OUTPUT.put_line('Max MBPS = ' || l_mbps);
This gives us the numbers we would expect in several successive runs (on a 6 OCPU instance with a single 3.5 TB block volume of type "balanced performance"):
Max IOPS = 23833
Max MBPS = 448
When changing from "Balanced" to "Higher Performance" the numbers get a bit better (even though I expected the IOPS to increase even more):
Max IOPS = 28426
Max MBPS = 468
The obvious approach to improving this is to split up a block volume into possibly smaller block volumes. When using "Balanced Performance", it should be sufficient to reduce each block volume to about 1 TB in order to achieve "maximum throughput". To achieve maximum IOPS throughput, it might be even better to divide the single block volume into pieces of 500 GB each.
In two more blog posts, I'll describe how this can be achieved with either Oracle Automatic Storage Management (ASM) or by using the Linux Volume Management (LVM). Spoiler: With both methods the performance can be accelerated by a factor of 3-5 very easily and without additional costs.