Test setup
In this section, we describe the test setup, including the queries we used, how we generated the data, and a summary of the methodology we followed.
We use the industry-standard TPC-DS test suite to benchmark Starburst performance. The TPC-DS benchmark data is modeled on the decision support functions of a retail product supplier and consists of seven (7) fact tables and seventeen (17) dimension tables. We use a scale factor of 1000, which corresponds to 1 TB of data. TPC-DS consists of 99 queries. For more information on how to generate the 99 queries, see B TPC-DS Queries Generation and Required Modifications. The 99 queries are divided into four broad classes: reporting queries, ad hoc queries, iterative OLAP queries, and data mining queries.
The TPC-DS data set is generated in CSV format using the TPC-DS dsdgen utility. The data set is stored in an ECS file system-enabled bucket, which is mounted over NFS on the host where dsdgen runs while the data set is created. A Starburst Create Table as Select (CTAS) statement, issued through the Iceberg connector, converts the generated CSV data and writes it to ECS in the Iceberg table format with Parquet data files. The data is generated in both partitioned and non-partitioned layouts. All seven (7) fact tables are partitioned by date. The partitioned data is then compacted for higher read performance.
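The conversion and compaction steps described above could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not the procedure used for this guide: the catalog and schema names (`hive.tpcds_csv`, `iceberg.tpcds_sf1000`) and the date column chosen for each fact table are hypothetical. The sketch relies on the Iceberg connector's `format` and `partitioning` table properties and its `optimize` table procedure.

```python
# Assumed date key column for each of the seven TPC-DS fact tables.
# The guide states only that fact tables are partitioned by date; the
# specific columns below are an illustrative choice, not taken from it.
FACT_TABLE_DATE_KEYS = {
    "store_sales": "ss_sold_date_sk",
    "store_returns": "sr_returned_date_sk",
    "catalog_sales": "cs_sold_date_sk",
    "catalog_returns": "cr_returned_date_sk",
    "web_sales": "ws_sold_date_sk",
    "web_returns": "wr_returned_date_sk",
    "inventory": "inv_date_sk",
}


def ctas_statement(table, source="hive.tpcds_csv",
                   target="iceberg.tpcds_sf1000", partitioned=True):
    """Build a CTAS statement that rewrites a CSV-backed table as Iceberg/Parquet.

    `source` and `target` are hypothetical catalog.schema names.
    A real conversion would likely cast each column to its TPC-DS type,
    because CSV-backed tables typically expose every column as varchar.
    """
    props = ["format = 'PARQUET'"]
    date_key = FACT_TABLE_DATE_KEYS.get(table)
    # Only fact tables get a partitioning property; dimension tables do not.
    if partitioned and date_key is not None:
        props.append(f"partitioning = ARRAY['{date_key}']")
    return (
        f"CREATE TABLE {target}.{table}\n"
        f"WITH ({', '.join(props)})\n"
        f"AS SELECT * FROM {source}.{table}"
    )


def optimize_statement(table, target="iceberg.tpcds_sf1000"):
    """Build the statement that compacts a table's data files."""
    return f"ALTER TABLE {target}.{table} EXECUTE optimize"


if __name__ == "__main__":
    for fact_table in FACT_TABLE_DATE_KEYS:
        print(ctas_statement(fact_table) + ";")
        print(optimize_statement(fact_table) + ";")
```

Calling `ctas_statement(table, partitioned=False)` yields the non-partitioned variant of the same table, which mirrors the two layouts described above.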
The Apache JMeter test suite is used to execute the benchmarking tests. The Starburst Iceberg connector is used to connect to the TPC-DS data in ECS.
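As an illustration of how a JMeter run might be launched from a script, the sketch below assembles a non-GUI JMeter command line. The test plan name, results file name, and property names are hypothetical; the guide does not specify them. Only the standard JMeter options `-n` (non-GUI mode), `-t` (test plan), `-l` (results log), and `-J` (define a JMeter property) are used.

```python
def jmeter_command(test_plan, results_file, properties=None):
    """Return the argument list for a non-GUI JMeter run.

    `test_plan` and `results_file` are caller-supplied paths; any entries
    in `properties` are passed to the test plan as -Jname=value.
    """
    cmd = ["jmeter", "-n", "-t", test_plan, "-l", results_file]
    # Sort for a deterministic argument order.
    for name, value in sorted((properties or {}).items()):
        cmd.append(f"-J{name}={value}")
    return cmd


if __name__ == "__main__":
    # Hypothetical file and property names, for illustration only.
    print(" ".join(jmeter_command("tpcds.jmx", "results.jtl", {"threads": 4})))
```

The returned list can be passed to `subprocess.run` on a host where JMeter is installed and on the `PATH`.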