GPDSB is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. The test creates a database schema, loads the data, and then runs 99 decision-support queries. It runs these 99 queries in random order, with the number of concurrent sessions specified as a test parameter. The test simulates real-world database workloads with queries that range from simple to complex.
The GPDSB toolkit was downloaded from https://github.com/pivotal/TPC-DS. Its scripts measure database performance by running both single-user and concurrent-user queries.
Note: The GPDSB-like queries that were used in the tests and described in this document are not part of an audited benchmark and are provided for educational purposes only.
As mentioned in the Logical architecture section for Greenplum, each ESXi host holds a single VM with five primary and five mirror segments. With ten ESXi hosts, this gives 100 segments in total, of which the 50 primary segments carried out the GPDSB tests. The configuration details of the master and segments are provided in the Appendix.
Initially, a trial GPDSB run was carried out as a smoke test to verify that the queries run successfully. The smoke test used a scale factor of 1, which loads 1 GB of data across the Greenplum data segments. A score of 0 is expected, as shown in the following result:
[root@bdldgpmtmp01a TPC-DS]# tail -f tpcds.log
Scale Factor 1
Load 35
Analyze 38
1 User Queries 183
Concurrent Queries 365
Q 594
TPT 366
TTT 365
TLD 0
Score 0
The following configuration, set in the tpcds_variables.sh file, was used to run the GPDSB test with five concurrent users and a 3 TB data load.
# benchmark options
GEN_DATA_SCALE="3000"
MULTI_USER_COUNT="5"

# step options
RUN_COMPILE_TPCDS="true"
RUN_GEN_DATA="true"
RUN_INIT="true"
RUN_DDL="true"
RUN_LOAD="true"
RUN_SQL="true"
RUN_SINGLE_USER_REPORT="true"
RUN_MULTI_USER="true"
RUN_MULTI_USER_REPORT="true"
RUN_SCORE="true"
After running the GPDSB test, an overall score of 40 was achieved as follows:
[root@bdldgpmtmp01a TPC-DS]# tail -f tpcds.log
Scale Factor 3000
Load 1521
Analyze 143
1 User Queries 6664
Concurrent Queries 38549
Q 1485
TPT 33320
TTT 38549
TLD 76
Score 40
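The derived fields in the two logs above (Q, TPT, TTT, TLD, and Score) can be reproduced from the measured timings with simple integer arithmetic. The sketch below is only a consistency check: the relationships in it are inferred from the logged values in this document, not taken from the toolkit's documentation, and the user count of 2 for the smoke test is an assumption that happens to match its Q and TPT values. The function name `derive` is hypothetical.

```shell
#!/usr/bin/env bash
# Recompute the derived fields of tpcds.log from the measured timings.
# ASSUMPTION: every formula below is inferred from the two logs shown in
# this document; none is taken from the toolkit's documentation.

QUERIES_PER_STREAM=99   # the test runs 99 queries per session

# derive SCALE USERS LOAD_SECS SINGLE_USER_SECS CONCURRENT_SECS
# Prints "Q TPT TTT TLD Score" using truncating integer arithmetic.
derive() {
  local sf=$1 users=$2 load=$3 single=$4 concurrent=$5
  local q=$(( 3 * users * QUERIES_PER_STREAM ))
  local tpt=$(( single * users ))
  local ttt=$concurrent
  local tld=$(( users * load / 100 ))              # 0.01 * users * load
  local score=$(( sf * q / (tpt + 2 * ttt + tld) ))
  echo "$q $tpt $ttt $tld $score"
}

# 3 TB run with five concurrent users
derive 3000 5 1521 6664 38549   # prints: 1485 33320 38549 76 40

# Smoke test; the user count of 2 is assumed, it is not shown in the log
derive 1 2 35 183 365           # prints: 594 366 365 0 0
```

Under these assumed relationships, the concurrent query time dominates the denominator, so the score for the 3 TB run is driven mainly by how long the five-user phase takes.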
The test was also run against the 3 TB data set with the number of concurrent users varied from one to five. The following graph shows how linearly performance scaled across these runs:
Figure 7. GPDSB test for 1 to 5 concurrent users
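A sweep over one to five concurrent users like the one plotted above could be scripted by rewriting MULTI_USER_COUNT in tpcds_variables.sh before each run. The following is a minimal sketch, assuming GNU sed and a variables file with one assignment per line; the helper names `set_user_count` and `sweep` are hypothetical, and the command that launches the benchmark is passed in by the caller rather than assumed here.

```shell
#!/usr/bin/env bash
# Sketch of a concurrency sweep. Helper names are hypothetical and the
# benchmark launch command is supplied by the caller.

# set_user_count FILE N: rewrite the MULTI_USER_COUNT line in FILE to N.
# Assumes GNU sed (for -i) and one assignment per line in FILE.
set_user_count() {
  local file=$1 count=$2
  sed -i "s/^MULTI_USER_COUNT=.*/MULTI_USER_COUNT=\"${count}\"/" "$file"
}

# sweep FILE CMD [ARGS...]: run CMD once for each user count from 1 to 5,
# updating FILE before every run. Stops at the first failing run.
sweep() {
  local file=$1
  shift
  local n
  for n in 1 2 3 4 5; do
    set_user_count "$file" "$n"
    echo "Running with MULTI_USER_COUNT=${n}"
    "$@" || return 1
  done
}
```

For example, `sweep tpcds_variables.sh ./run_benchmark.sh` would perform five runs, where `./run_benchmark.sh` stands in for however the toolkit is started in a given environment. Copying tpcds.log aside after each run keeps the per-run scores for plotting.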
The following Grand Unified Configuration (GUC) settings were applied for the 3 TB test runs.
gpconfig -c gp_interconnect_queue_depth -v 16
gpconfig -c gp_interconnect_snd_queue_depth -v 16
gpconfig -c gp_resource_manager -v queue
gpconfig -c gp_vmem_protect_limit -v 45000
gpconfig -c max_statement_mem -v 9GB
gpconfig -c statement_mem -v 9GB
gpconfig -c gp_resqueue_memory_policy -v auto
gpconfig -c gp_resqueue_priority_cpucores_per_segment -v 16
gpconfig -c runaway_detector_activation_percent -v 100
gpconfig -c optimizer_enable_associativity -v on
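Before starting a long run, it can be useful to confirm that these settings are in effect. The sketch below loops over the parameter names listed above and queries each one with `gpconfig -s`, which shows the current value of a parameter. The array and function names are hypothetical, and the script assumes it runs as the Greenplum administrative user on the master host.

```shell
#!/usr/bin/env bash
# Parameter names copied from the gpconfig commands in this section.
GUCS=(
  gp_interconnect_queue_depth
  gp_interconnect_snd_queue_depth
  gp_resource_manager
  gp_vmem_protect_limit
  max_statement_mem
  statement_mem
  gp_resqueue_memory_policy
  gp_resqueue_priority_cpucores_per_segment
  runaway_detector_activation_percent
  optimizer_enable_associativity
)

# show_gucs: print the current value of every parameter in GUCS.
# Stops and returns non-zero if any gpconfig call fails.
show_gucs() {
  local guc
  for guc in "${GUCS[@]}"; do
    gpconfig -s "$guc" || return 1
  done
}

# Usage (on the Greenplum master host):
#   show_gucs
```

Values set with `gpconfig -c` take effect only after the configuration is reloaded or the cluster is restarted, depending on the parameter, so a mismatch here usually means that step is still pending.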
During the 3 TB GPDSB test run, host CPU and VM CPU utilization peaked at 75-80%, as shown in the following figures.
Figure 8. Host CPU during GPDSB run for 3 TB
Figure 9. VM CPU during GPDSB run for 3 TB
The following figure shows the PowerFlex GUI during the 3 TB GPDSB run. An average throughput of 4 GB/s was achieved at close to 14 K IOPS.
Figure 10. PowerFlex GUI during the 3 TB GPDSB test run