Dell Technologies Delivers Outstanding Results in MLPerf™ Inference v4.1
Thu, 29 Aug 2024 11:06:54 -0000
Today marks the unveiling of the MLPerf v4.1 Inference results, an industry-standard benchmark for AI systems. These benchmarks assess the system-level performance of state-of-the-art hardware and software stacks across a wide range of AI tasks, including image classification, object detection, natural language processing, speech recognition, recommender systems, medical image segmentation, large language model (LLM) question answering, and text-to-image generation.
As a founding member of MLCommons™, Dell Technologies has participated consistently in MLPerf Inference benchmarks since their inception. Our results in this round are exceptional, emphasizing our commitment to delivering outstanding system performance for the most demanding AI workloads, including generative AI.
What's New in Inference v4.1?
Inference v4.1 and Dell's submission introduce several exciting updates:
- Introduction of the Mixtral 8x7B benchmark, expanding the suite of LLM tests with this Mixture of Experts (MoE) model.
- Submission across various Dell PowerEdge XE and R platforms with new GPUs, showcasing the versatility of our hardware solutions.
- New results with AMD Instinct MI300X accelerators and NVIDIA H200 and H100 NVL Tensor Core GPUs, demonstrating our commitment to cutting-edge accelerator technologies. A total of 115 NVIDIA GPU-based results were submitted across all benchmarks, along with four AMD accelerator-based results with the Llama 2 model and 11 CPU-based results.
- Results across different host operating systems, highlighting our flexibility in supporting diverse environments.
- Intel-based CPU-only results, demonstrating performance across diverse compute architectures.
Highlights
Dell's submissions for MLPerf Inference v4.1 showcase a wide range of Dell PowerEdge servers equipped with cutting-edge GPUs and CPUs.
Performance
The performance highlights include:
- The Dell PowerEdge XE9680 server (NVIDIA H200 Tensor Core GPU configuration) emerged as a leader among systems using 8 x H200 700W GPUs, securing #1 positions in ResNet Offline, BERT-99 Offline, GPT-J 99 and 99.9 Server, SDXL Offline, and Llama 2 99 and 99.9 Offline.
- The Dell PowerEdge XE8640 and XE9640 servers (NVIDIA H100 Tensor Core GPU configuration) achieved a near sweep among systems with 4 x H100 GPUs, securing #1 titles across all benchmarks and scenarios in the Data Center (DC) suite except the DLRM v2 and BERT 99 and BERT 99.9 Offline benchmarks, making Dell's 4 x H100 systems the best-performing systems overall.
- The Dell PowerEdge R760xa server (NVIDIA H100 NVL GPU configuration) delivered exceptional results among systems with 4 x H100 NVL GPUs, capturing #1 titles in RetinaNet Offline, 3D-UNet 99 and 99.9 Offline, BERT 99 Offline and Server, BERT 99.9 Offline and Server, DLRMv2 99 and 99.9 Offline and Server, GPT-J 99, and SDXL Offline.
- The Dell PowerEdge R760xa server (NVIDIA H100 Tensor Core GPU configuration) continued the success among systems with 4 x H100 PCIe GPUs, achieving #1 positions in key benchmarks such as RetinaNet Offline, 3D-UNet 99 and 99.9 Offline, BERT 99 Offline and Server, BERT 99.9 Offline and Server, DLRMv2 99 and 99.9 Offline and Server, GPT-J 99, and SDXL Offline.
- The Dell PowerEdge R760 server (4 x NVIDIA L40S GPU configuration) also made a strong showing among systems with NVIDIA L40S GPUs, securing #1 in ResNet 50 Offline, RetinaNet Offline, and BERT 99. These rankings are normalized per GPU.
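Per-GPU normalization simply divides a system's measured throughput by its accelerator count, so systems with different GPU counts can be compared fairly. A minimal sketch (not an official MLPerf tool; all numbers below are hypothetical placeholders, not actual MLPerf v4.1 submissions):

```python
def per_gpu_throughput(samples_per_second: float, gpu_count: int) -> float:
    """Return system-level Offline throughput normalized per GPU."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return samples_per_second / gpu_count


# Hypothetical comparison of an 8-GPU and a 4-GPU system:
# the 4-GPU system wins on a per-GPU basis despite lower total throughput.
system_a = per_gpu_throughput(samples_per_second=40000.0, gpu_count=8)
system_b = per_gpu_throughput(samples_per_second=22000.0, gpu_count=4)
print(system_a)  # 5000.0 samples/s per GPU
print(system_b)  # 5500.0 samples/s per GPU
```

This is the same normalization implied by the per-GPU-count claims above: raw Offline throughput scales with accelerator count, so dividing by that count isolates per-accelerator efficiency.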
System configurations
The Dell PowerEdge server configurations include:
- Dell PowerEdge XE9680 server with:
- 8 x AMD Instinct MI300X accelerators
- 8 x NVIDIA Tensor Core H100-SXM-80GB GPUs
- 8 x NVIDIA Tensor Core H200-SXM-141GB GPUs
- Dell PowerEdge XE9640 server with:
- 4 x NVIDIA H100-SXM-80GB GPUs
- Dell PowerEdge XE8640 server with:
- 4 x NVIDIA H100-SXM-80GB GPUs
- Dell PowerEdge R760xa server with:
- 4 x NVIDIA H100 NVL GPUs
- 4 x NVIDIA H100-PCIe-80GB GPUs
- 4 x NVIDIA L40S GPUs
- Dell PowerEdge R760 server with:
- Intel Xeon Platinum 8592+ CPUs (CPU-only configuration)
- Dell PowerEdge XR8620t server with:
- 1x NVIDIA L4 GPU (Edge configuration)
Real-world performance advantages
These benchmark results translate into significant real-world advantages for end users:
- Accelerated AI workloads: The exceptional performance of our PowerEdge XE9680 and PowerEdge R760xa servers with NVIDIA H200 and H100 NVL Tensor Core GPUs means that businesses can handle more complex AI models and larger datasets with unprecedented speed.
- Efficient LLM processing: Strong results in LLM benchmarks, particularly with the PowerEdge XE9680 server using the latest GPUs, indicate that Dell servers are well equipped to handle demanding generative AI applications efficiently.
- Flexible deployment options: Our results across different operating systems demonstrate Dell's ability to support diverse IT infrastructures, enabling seamless AI integration into existing setups.
- Edge AI capabilities: The PowerEdge XR8620t server's performance shows Dell's readiness to support AI inferencing at the edge, which is crucial for applications in retail, manufacturing, and smart cities.
- Future-ready infrastructure: By excelling in cutting-edge benchmarks with the latest accelerators, we show that Dell servers are prepared for the next wave of AI innovations, protecting our customers' investments.
Conclusion
The MLPerf Inference v4.1 results demonstrate Dell Technologies' commitment to delivering high-performance AI infrastructure. Dell PowerEdge servers, equipped with the latest GPUs and optimized software stacks, have secured numerous top positions across various benchmarks. These results make them an excellent choice for both data center and edge inference deployments.
These results equip enterprises with valuable insights for making informed decisions about server performance and sizing, ultimately supporting their AI transformation journey. As the demand for AI computing continues to grow, especially in the realm of generative AI, Dell Technologies remains at the forefront, offering solutions that meet the evolving needs of our customers.
For more detailed information about our MLPerf Inference v4.1 results or to discuss how Dell Technologies can support your AI initiatives, contact your Dell representative.
MLCommons Results
https://mlcommons.org/en/inference-datacenter-41/
https://mlcommons.org/en/inference-edge-41/
The preceding results are MLCommons results for MLPerf IDs 4.1-0013 through 4.1-0022 in the closed data center division and 4.1-0023 in the closed edge division.