All our systems included NVIDIA accelerators. The following figure shows all our MLPerf Training v2.0 submissions.
The graph shows all the submissions to MLPerf Training v2.0 by Dell Technologies. These submissions span different problem areas, including image classification, medical image segmentation, lightweight object detection, heavyweight object detection, speech recognition, natural language processing, and recommendation, across Dell PowerEdge R750xa and XE8545 servers and Dell DSS 8440 servers.
Dell Technologies contributed about 16 percent of the closed-division submissions, which translates to 42 results out of 258 closed results. Not only did we have the highest number of submissions, we were also the only submitter to submit on different host operating systems. These submissions are important because they showcase the flexibility and performance of different deep learning frameworks; the operating system can sometimes pose a challenge when running these workloads. On Dell servers, running different operating systems is not an issue, and differences in operating systems do not necessarily lead to differences in performance.
We also earned other winning titles that demonstrate outstanding performance. For instance, we achieved the best score on the newly introduced RetinaNet model with a four-GPU NVIDIA A100-PCIe system.
Our results use different combinations of servers and accelerators to give our customers more datapoints across a wide variety of system configurations. These results were extracted from 12 different systems, also referred to as systems under test (SUTs).
Besides chip makers such as NVIDIA, Google, Graphcore, and Intel, Dell Technologies was the only OEM submitter to contribute many on-premises multinode results. Our multinode submissions have grown in number, an important trend as the need to train large models has been growing rapidly. In our submissions, we demonstrated different container scaling methodologies, including Slurm with Docker and Slurm with Singularity, so that customers can choose what suits their data center environment best.
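To give a sense of the Slurm-with-Singularity approach mentioned above, the sketch below shows a minimal multinode batch script. The image name, paths, training command, and node counts are illustrative placeholders, not the actual submission configuration.

```shell
#!/bin/bash
# Hypothetical Slurm batch script launching a containerized training job
# across multiple nodes with Singularity. All names below are placeholders.
#SBATCH --job-name=mlperf-train
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4        # one task per GPU
#SBATCH --gres=gpu:4

# srun starts one container instance per task; --nv exposes the NVIDIA
# GPU driver and libraries inside the container.
srun singularity exec --nv mlperf_training.sif \
    python train.py --config multi_node.yaml
```

The same pattern applies to Slurm with Docker, with the `singularity exec` line swapped for the site's preferred Docker launch wrapper.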
Our multinode results demonstrate near-linear or linear scaling performance. This scaling allows our customers to reach faster time to value on their workloads.
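As a rough illustration of what "near-linear scaling" means, scaling efficiency can be computed as the achieved speedup divided by the ideal (linear) speedup. The timings below are hypothetical numbers for illustration only, not measured MLPerf results.

```python
# Hypothetical illustration of scaling efficiency: speedup relative to
# ideal linear scaling (1.0 = perfectly linear). The timings are made up.

def scaling_efficiency(t_single: float, t_multi: float, n_nodes: int) -> float:
    """Return achieved speedup divided by the ideal n_nodes speedup."""
    speedup = t_single / t_multi
    return speedup / n_nodes

# Example: a job that takes 100 minutes on 1 node and 26 minutes on 4 nodes
# achieves about 96 percent of ideal linear scaling.
print(round(scaling_efficiency(100.0, 26.0, 4), 3))
```

Values close to 1.0 indicate that adding nodes reduces time to train almost proportionally, which is what shortens time to value.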