Running an at-scale benchmark

To run benchmarks at scale, create scripts that make it easy to start and stop jobs across multiple compute nodes. Two example scripts are provided to streamline this process:
- run_aosp.sh runs a stand-alone test on a single compute node
- run_aosp_slurm.sh starts a multi-job, multi-node scale test
Both scripts are in the Appendix. Place them in the working directory on the head node.
Remember that you can use both sbatch and srun with these scripts. The sbatch command submits a job script for later execution, while srun runs a job in real time. A job script typically contains one or more srun commands to launch parallel tasks. From within a script file, you can use sbatch to submit run_aosp.sh to one or more nodes in the cluster. Multiple run_aosp.sh jobs can run on the same node, as long as each job specifies the number of CPU cores to use.
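As a rough illustration of placing several run_aosp.sh jobs on one node, the sketch below only prints the sbatch command lines rather than submitting anything. The node name node01, the core counts, and the helper build_submit_cmd are hypothetical, and it assumes run_aosp.sh is a valid batch script that respects the cores it is allocated. The --nodelist, --ntasks, and --cpus-per-task options are standard sbatch options.

```shell
#!/bin/bash
# Sketch: print (do not run) the sbatch commands that would place several
# run_aosp.sh jobs on one node, each with its own CPU core count.
# The node name and core counts below are hypothetical.

# build_submit_cmd NODE CORES -> prints one sbatch command line
build_submit_cmd() {
    local node=$1 cores=$2
    printf 'sbatch --nodelist=%s --ntasks=1 --cpus-per-task=%s run_aosp.sh\n' "$node" "$cores"
}

# Two jobs sharing the hypothetical node "node01", 12 cores each.
for cores in 12 12; do
    build_submit_cmd node01 "$cores"
done
```

Review the printed lines before running them by hand on the head node. srun accepts the same --nodelist and --cpus-per-task options when a job should run in real time instead of being queued.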
You can adjust the Job Array number or the number of CPU cores used for the AOSP build, depending on the number of nodes and CPU cores available across the cluster. Allocating more CPU cores with --cpus-per-task generally makes each individual job finish faster.
In the run_aosp_slurm.sh example, 24 Job Array tasks are scheduled independently of one another. Each task is allocated 24 CPU cores on a single node (--cpus-per-task=24).
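The settings described above could be expressed with batch directives along the following lines. This is a minimal sketch, not the script from the Appendix, which may differ. The job name, the log file pattern, and the way run_aosp.sh is invoked are assumptions. Only --array and --cpus-per-task=24 come from the example itself.

```shell
#!/bin/bash
# Sketch only: the actual run_aosp_slurm.sh is in the Appendix and may differ.

# Hypothetical job name.
#SBATCH --job-name=aosp-scale
# 24 independent Job Array tasks.
#SBATCH --array=1-24
# Each array task runs on a single node as one task with 24 CPU cores.
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=24
# Hypothetical log name: %A is the array job ID, %a is the array task index.
#SBATCH --output=aosp_%A_%a.out

# Slurm sets these variables for each array task.
echo "Array task ${SLURM_ARRAY_TASK_ID} on $(hostname), ${SLURM_CPUS_PER_TASK} cores"

# Launch the single-node build script. Passing the core count explicitly
# keeps srun in step with the allocation. How run_aosp.sh takes its
# arguments is an assumption.
srun --cpus-per-task="${SLURM_CPUS_PER_TASK}" ./run_aosp.sh
```

A script like this would be submitted once from the head node with sbatch run_aosp_slurm.sh, and the scheduler then starts each array task as cores become available.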
In this case, 576 CPU cores (24 tasks * 24 cores) would be required to run all tasks in parallel. Because only 288 CPU cores were available in this example, the Job Array tasks were split into 2 groups of 12 (12 tasks * 24 cores = 288 cores). The first group of 12 starts immediately, and the second group of 12 is queued as pending until enough CPU cores become available to the job scheduler.
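The arithmetic above can be checked with a short shell sketch. The function name array_waves is hypothetical, and the model is simplified: it treats the cluster as one pool of cores and ignores node boundaries. In practice the scheduler starts pending tasks as soon as enough cores are free, not in strict groups.

```shell
#!/bin/bash
# Sketch: estimate how a Job Array is split into groups when the cluster
# has fewer CPU cores than the whole array needs.
# Simplification: cores are treated as a single pool across all nodes.

# array_waves TASKS CPUS_PER_TASK AVAILABLE_CORES
# Prints total cores needed, tasks that fit at once, and number of groups.
array_waves() {
    local tasks=$1 cpus_per_task=$2 available=$3
    local needed=$(( tasks * cpus_per_task ))
    local concurrent=$(( available / cpus_per_task ))   # integer division
    if (( concurrent < 1 )); then
        echo "error: one task needs more cores than are available" >&2
        return 1
    fi
    # Never count more concurrent tasks than the array contains.
    if (( concurrent > tasks )); then
        concurrent=$tasks
    fi
    local waves=$(( (tasks + concurrent - 1) / concurrent ))   # ceiling division
    echo "needed=$needed concurrent=$concurrent waves=$waves"
}

# Values from the example: 24 tasks, 24 cores each, 288 cores available.
array_waves 24 24 288
# prints: needed=576 concurrent=12 waves=2
```

Re-running the function with a larger core count shows when the whole array would fit in a single group, which helps when deciding whether to raise the array size or --cpus-per-task.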
When the initial 12 tasks have finished, the other 12 tasks start in parallel but independently of each other. This type of workflow is common with EDA, for example, when applying the same program to a list of files.
If more CPU cores are available, increase the Job Array number (#SBATCH --array) or allocate more CPU cores to a single job (--cpus-per-task) in the run_aosp_slurm.sh example.
There are many factors to consider while running an at-scale benchmark. It is important to monitor the entire infrastructure and identify any bottlenecks quickly within the application, job scheduler, network, and storage.
