This whitepaper provided an overview of MLPerf Training v3.0 on the new generation of Dell servers. The key takeaways include:
- This round of submissions introduced Dell servers that did not appear in previous rounds.
- We saw significant performance gains with the new-generation servers and NVIDIA GPUs.
- The Dell PowerEdge XE9680 server with NVIDIA HGX H100 GPUs delivers the highest training performance across the different workloads in the benchmark, such as image classification, medical image segmentation, lightweight and heavyweight object detection, speech recognition, NLP, and recommendation.
- Dell PowerEdge XE9680, XE8640, and R760xa servers are excellent choices for deep learning training workloads because they scale well horizontally. More information about multinode scaling will be available in a new blog.
- Our results include many submissions to the Closed division. This large number of submissions lets customers view the data in different ways and make like-for-like comparisons between vendors and OEMs when deciding what suits their workloads.
- Submissions to MLPerf Training continue to grow; v3.0 has about 30 percent more results than the last round of submissions.
- Hardware and software improvements can significantly reduce time to convergence for different types of deep learning training workloads.
- Customers can leverage Dell Technologies to fuel their AI transformation and deliver high-performance deep learning training, including workloads such as generative AI. For example, Dell Technologies has partnered with NVIDIA to release Project Helix.