Collect AWR reports over peak workload periods to identify any potential bottlenecks. AWR averages all metrics over the duration of the report; therefore a 24-hour report is generally not useful. A useful AWR report is produced for a short period of time, for example, 15 minutes, 30 minutes, or one hour, when the workload is stable and heavy.
When using Oracle RAC, AWR reports can be produced for each instance separately or for the cluster as a whole. The Instance AWR metrics only represent the workload on that specific database server. The RAC AWR metrics represent the workload from the entire cluster. In the following examples, we show both types. Note that with each Oracle release some changes were made to the report.
The Load Profile area in an Instance AWR report includes “Physical reads (blocks)”, “Physical writes (blocks)”, and “Logical reads” metrics, as shown in the following figure. The units for these metrics are database blocks. Block units cannot be directly translated to I/O metrics. However, using these numbers can provide an indication of the database I/O profile, such as reads versus writes ratio and how many of the reads are satisfied from the buffer cache (logical reads) versus actual read I/Os (physical reads).
Typically, expect a high percent of an OLTP read workload to be satisfied by database cache (logical reads). In benchmarks, we often limit the database cache size to generate more I/Os and the numbers are much closer to each other.
Note that in an AWR report for an Oracle 12c database, the load profile also provides actual I/O metrics (as shown by the second group of highlighted metrics).
Figure 33. Load Profile section in a single Instance AWR Report
The cluster AWR report provides similar information, as shown in the following figure.
Figure 34. Cluster AWR report: Load Profile
Ideally, the database should wait most of the time for CPU and I/O, which is an indication that the system is working at its physical limit. Ensure that the db file sequential read field of the AWR report (which actually means random-read) has an average wait time appropriate to the storage type and application needs. For example, 596usec, or 0.6ms I/O latency, as shown in the following figure.
Figure 35. Cluster AWR report: Top Timed Events
The log file parallel write metric indicates how fast the Oracle Log Writer is able to write the redo logs to storage. Oracle can generate redo log writes in different I/O sizes, based on the load (up to 1 MB in size). The larger the I/O, the longer it takes to complete. However, we see in this example that t log file parallel write metric shows 1.24 ms write latency, which is a an excellent number for large I/Os.
To find IOPS and MB/sec I/O-related metrics in the AWR report, look for the physical read total IO requests, physical write total IO requests, physical read total bytes, and physical write total bytes metrics. These metrics provide read IOPS, write IOPS, read bandwidth, and write bandwidth.
The following figure shows that the cluster executed 385,808 read I/Os per second, 108,197 write I/Os per second, 2.96GB/sec read bandwidth (3,179,107,539 / 1024 / 1024 / 1024 to change from bytes/sec to GB/sec), and 0.84GB/sec write bandwidth (900,994,499 / 1024 / 1024 / 1024). Of course, the bandwidth is of more interest during DSS workloads.
Figure 36. Cluster AWR report: System Statistics
Redo logs are key to Oracle database resiliency and performance. The Oracle write size to the logs ranges from 512 bytes up to 1 MB. Oracle switches to the next log file based on multiple conditions, such as how full the log buffer and the log file are, and time.
Configure the redo logs size so that Oracle switches log files only a small number of times per hour. Ensure that there are enough log files so that they never wait for the archiving processes to complete at log switch time.
The log switch count is in the Instance AWR report, as shown in the following figure. The number under Total is how many switches occurred during the AWR report duration. The per Hour value is a derived number, an estimate based on the activity during the AWR report.
Figure 37. Instance AWR report: log_switch