Use Case 1: Small TimesTen cache on DRAM


The goal of this use case was to study the performance and resource utilization impact of a small TimesTen cache that holds a relatively small subset of the backend RAC database.
Configuration overview
The following figure shows the dataset configuration for Use Case 1.
Figure 2. Use Case 1: Small (400M subscribers) dataset configuration overview
In Use Case 1, we tested two configurations, shown as the two adjacent columns in the figure above. The left column shows the baseline configuration and the right column shows the TimesTen test configuration.
In this use case, the two-socket PowerEdge server used as the application (TimesTen) server hosts both the HLR benchmark application and TimesTen. It has a total physical memory capacity of 768 GB, provided entirely by DRAM modules. Given this main memory size, we could explicitly load a maximum of 400 million SUBSCRIBER rows (400M subscribers) into the TimesTen cache, so we set the target workload data size for both test configurations in this use case to 400M subscribers. Because 400M subscribers is only 16 percent of the 2,500 million SUBSCRIBER rows in the HLR schema on the backend RAC database, this use case represents the scenario in which a relatively small subset of the RAC database is cached in the TimesTen cache.
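As a quick check on the sizing above, the cached share follows directly from the two row counts given in the text. The variable names in this sketch are illustrative only and do not come from the benchmark tooling.

```python
# Row counts taken from the configuration description above.
cached_rows = 400_000_000      # SUBSCRIBER rows loaded into the TimesTen cache
total_rows = 2_500_000_000     # SUBSCRIBER rows in the backend RAC database

# Fraction of the HLR schema held in the cache.
cached_fraction = cached_rows / total_rows
print(f"Cached share of the HLR schema: {cached_fraction:.0%}")
```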
Each of the two RAC node servers has 768 GB of physical memory. We set the Oracle database's System Global Area (SGA) to 384 GB. For details on the RAC and TimesTen database configuration, refer to Appendix: Configuration details.
Test methodology
During the baseline tests, we ran the HLR benchmarking workload directly against the backend two-node RAC database setup with a target workload size of 400M subscribers. During each of the baseline tests, we ran seven workload iterations with increasing HLR application thread counts of 2, 4, 8, 16, 32, 48, and 64. We repeated the baseline tests three times and reported the averages of the three runs.
Note: During the baseline tests, the TimesTen database and cache were disabled on the application server.
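The averaging step in the baseline procedure described above can be sketched as follows. This is a minimal illustration, not the HLR benchmark's own reporting code: the function name `average_runs` and the TPS numbers are hypothetical placeholders, and only the thread counts and the three-run repetition come from the text.

```python
from statistics import mean

# Thread counts used for the seven workload iterations (from the text).
THREAD_COUNTS = [2, 4, 8, 16, 32, 48, 64]


def average_runs(runs):
    """Average the measured TPS per thread count across repeated runs.

    `runs` is a list of dicts, one per run, mapping thread count -> TPS.
    """
    return {t: mean(run[t] for run in runs) for t in THREAD_COUNTS}


# Placeholder values for three runs; NOT measurements from this study.
example_runs = [
    {t: 100.0 * t for t in THREAD_COUNTS},
    {t: 110.0 * t for t in THREAD_COUNTS},
    {t: 90.0 * t for t in THREAD_COUNTS},
]

averaged = average_runs(example_runs)
for threads, tps in averaged.items():
    print(f"{threads:>2} threads: {tps:.1f} TPS (mean of {len(example_runs)} runs)")
```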
During the TimesTen tests, we first explicitly loaded 400M subscribers into the TimesTen cache, which ran on 768 GB of DRAM and was configured in Asynchronous Writethrough (AWT) caching mode. We then ran the HLR benchmarking workload against the TimesTen cache, limiting the target throughput to the maximum TPS rate achieved against RAC.
Note: As described in section Test methodology overview, the limited target TPS rate for the TimesTen tests was nearly the same as the highest TPS achieved by the baseline tests.
Performance results
The figure below shows the results of the baseline tests generated by the HLR application for the workload target size of 400M subscribers, run directly against the RAC database. The TPS values are plotted relative to the TPS achieved in the iteration with a workload thread count of 2. All latency values are plotted relative to the InsertCallForwarding (ICF) transaction latency observed in that same iteration.
Figure 3. Use Case 1 - 400M subscribers - Baseline (RAC-Only) - Relative TPS and Latency performance graph
As shown in the figure above, the RAC database and backend infrastructure scaled well in terms of TPS, delivering 24x more TPS at the peak workload thread count of 64 than at the starting workload thread count of 2. However, the latency of DML transactions increased noticeably as the workload increased. The figure reflects this in the relative transaction latency values, which increased by up to 44 percent for the two write-intensive transactions, InsertCallForwarding and UpdateSubscriberData.
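The relative scaling used for the baseline plot (each value divided by a reference measurement from the 2-thread iteration) can be sketched as below. The helper `relative_to` and all numeric inputs are hypothetical placeholders chosen for illustration; they are not results from this study.

```python
def relative_to(values, reference):
    """Express each value as a multiple of the reference value."""
    return {key: value / reference for key, value in values.items()}


# Placeholder TPS per thread count; NOT measurements from this study.
tps_by_threads = {2: 500.0, 64: 4000.0}
relative_tps = relative_to(tps_by_threads, tps_by_threads[2])

# Placeholder latencies (arbitrary units) at thread count 2, normalized
# to the InsertCallForwarding latency as described in the text.
latency_by_txn = {"InsertCallForwarding": 2.0, "UpdateSubscriberData": 1.5}
relative_latency = relative_to(
    latency_by_txn, latency_by_txn["InsertCallForwarding"]
)

print(relative_tps)
print(relative_latency)
```

Normalizing this way lets throughput and latency trends be compared across iterations without publishing absolute numbers.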
The figure below compares the application latencies for each of the seven HLR transactions between the baseline tests and the TimesTen cache tests.
Figure 4. Use Case 1 (400M subscribers) - Average Latencies: Baseline (RAC-only) vs TimesTen cache with RAC
To reiterate the test methodology, the transaction latencies above were captured for the baseline and the TimesTen tests when both delivered approximately the same peak TPS. All latencies in the figure above are relative to the baseline InsertCallForwarding (ICF) transaction latency. As the figure shows, in Use Case 1, where we tested with a relatively small dataset size (400M subscribers), the TimesTen cache delivered the same throughput as the RAC database but with an average of 37x better transaction response time across the seven query and DML transaction types.
Part of the study was also to determine how a TimesTen cache affects the resource utilization of the backend RAC database and its infrastructure.
The following figure shows the CPU utilization of RAC node2 as reported by the Oracle Automatic Workload Repository (AWR) reports that we captured during the baseline and the TimesTen cache tests.
Figure 5. RAC node2 CPU utilization - Use Case 1 - Baseline vs TimesTen cache tests
As shown in the AWR report, RAC node2's CPU utilization went from 12.3 percent (8.8 %User + 3.5 %System) during the baseline tests down to 8 percent (5.2 %User + 2.8 %System) during the TimesTen tests, a drop of about 4 percentage points. Though not significant for our particular test setup, this provides a data point that the TimesTen cache helped to offload queries and reduce the RAC node's CPU utilization. A TimesTen cache can therefore help to consolidate databases on fewer servers and provide a better return on the RAC infrastructure investment.
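The CPU utilization figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only the %User and %System values from the text; the dictionary layout and variable names are illustrative assumptions, not AWR output fields.

```python
# %User and %System values for RAC node2, as quoted from the AWR reports.
baseline = {"user": 8.8, "system": 3.5}
timesten = {"user": 5.2, "system": 2.8}

baseline_total = sum(baseline.values())
timesten_total = sum(timesten.values())

# Absolute change in percentage points, and the change relative to baseline.
delta_points = baseline_total - timesten_total
relative_reduction = delta_points / baseline_total

print(f"Baseline total CPU: {baseline_total:.1f}%")
print(f"TimesTen total CPU: {timesten_total:.1f}%")
print(f"Drop: {delta_points:.1f} percentage points "
      f"({relative_reduction:.0%} of the baseline utilization)")
```

Reporting both the percentage-point drop and the relative reduction avoids ambiguity about what a "delta" in percent refers to.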
Similarly, we compared the 'Top 10 Foreground Events by Total Wait Time' sections of the AWR reports from the two test cases, baseline and TimesTen. The 'log file sync' wait events observed during the baseline tests were sharply reduced during the TimesTen cache tests (65.9% DB time in baseline vs 5.4% DB time in TimesTen tests), as shown in the comparative figure below.
Figure 6. Foreground Wait Events - Use Case 1 - Baseline vs TimesTen cache tests
Also, during the TimesTen tests the RAC database spent a larger share of its time on the CPU, which is desirable, rather than waiting on other resources (DB CPU = 76.9% DB time), compared with the baseline tests (DB CPU = 41.6% DB time). This is a difference of about 35 percentage points. The finding provides another proof point that running a TimesTen cache contributes to better total system capacity and utilization of the backend RAC database.
Note: The application latencies reported by the HLR application for the TimesTen tests are measured on transaction commits to the local TimesTen cache running on DRAM. The RAC database performance metrics reported during the TimesTen tests reflect the TimesTen replication agent propagating the transaction updates and committing them to the backend RAC database.
