
The PowerEdge XE8545: Performance Summary
Mon, 16 Jan 2023 19:50:52 -0000
AI Infrastructure for Computing Without Compromise
Summary
This document is a brief summary of the performance advantages that customers can gain when using the PowerEdge XE8545 acceleration server. All performance figures discussed are based on testing conducted in the Americas Data Center (CET) labs. Results accurate as of 3/15/2021. Ad Ref #G21000042
Dell’s Latest Acceleration Offering
The PowerEdge XE8545 is Dell EMC’s response to the needs of high-performance machine learning customers who want all the innovation and horsepower of the latest NVIDIA GPU technology immediately, without making major cooling-related changes to their data center. Its purpose-built air-cooled design delivers four A100 40GB/400W GPUs with a low-latency, switchless SXM4 NVLink interconnect, while letting the data center maintain an energy-efficient 35°C ambient temperature. An 80GB/500W GPU option is also available and has been shown to deliver 13-15% higher performance than the 400W GPUs at an only slightly lower ambient input temperature (28°C).
500W Advantage Delivers More Performance
- The XE8545 can unleash 500W of power with its 80GB GPU to outperform any 400W-based competition by 13-15%
Unlike competitors, Dell worked with NVIDIA early in the design process to ensure that the XE8545 could run at 500 watts of power when using the high-capacity 80GB A100 GPUs – and still be air-cooled. This 80GB/500W GPU option allows the XE8545 to drive harder and derive more performance from each of the GPUs. Using the common industry-standard ResNet50 v1.5 image classification benchmark with a standard batch size, the 500W GPU trained in 67.78 minutes, compared to 73.32 minutes for the 400W GPU – 7.56% faster. And when the batch size is doubled, the advantage grows to 13-15%! When speed of results is a customer’s primary concern, the XE8545 can deliver the power needed to get those results faster.
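The 7.56% figure follows directly from the two training times quoted above; a quick check of the arithmetic:

```python
# Relative speedup of the 500W GPU over the 400W GPU on ResNet50 v1.5,
# using the training times quoted in the text (minutes).
t_400w = 73.32
t_500w = 67.78

# Reduction in training time, as a percentage of the 400W baseline.
speedup_pct = (t_400w - t_500w) / t_400w * 100
print(f"{speedup_pct:.2f}% faster")  # → 7.56% faster
```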
Training – A Generational Leap in Performance
- XE8545 trains ResNet50 image classification to top-quality accuracy in less than HALF the time of previous-generation PowerEdge systems
It is clear from the chart above that an XE8545 with 40GB GPUs is more than twice as fast as the previous-generation C4140 when training an image classification model – in fact, faster than two C4140s running in parallel – and the 80GB GPU option is faster still. This is a great illustration of the combined power of the new technologies packed into the XE8545: the latest NVIDIA GPUs, the latest AMD CPUs, and the latest generation of PCIe I/O fabric. Further gains in performance can be achieved by workloads that take advantage of improvements in how the A100 performs the matrix multiplication involved in machine learning – by better accounting for “sparsity”. That is, the occurrence of many zeros in the matrix, which previously resulted in large numbers of time-consuming “multiply-by-zero” operations that had no effect on the final result.
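The multiply-by-zero savings described above can be illustrated with a toy dot product. This is a conceptual sketch only – not how the GPU implements structured sparsity in hardware – using the 2-zeros-in-every-4-weights pattern the A100 supports:

```python
# Toy illustration of why sparsity matters in matrix multiplication:
# zeros in the weight matrix produce multiply-by-zero operations that
# contribute nothing to the result, so a sparsity-aware kernel can
# skip them entirely. Conceptual sketch, not the hardware mechanism.

def dot_skipping_zeros(weights, activations):
    """Dot product that skips zero weights; returns (result, multiplies done)."""
    result, multiplies = 0.0, 0
    for w, x in zip(weights, activations):
        if w != 0.0:          # skip the wasted multiply-by-zero
            result += w * x
            multiplies += 1
    return result, multiplies

# 2:4 structured sparsity: in every group of 4 weights, 2 are zero.
weights = [0.5, 0.0, -1.2, 0.0, 0.0, 2.0, 0.0, 0.3]
activations = [1.0] * 8

value, work = dot_skipping_zeros(weights, activations)
print(value, work)  # same result as the dense sum, half the multiplies
```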
And as with all operations for the XE8545, it delivers the very top-level performance using only air-cooling. It does not require liquid cooling.
Inference Analysis of Images
- XE8545 can analyze over 150k images per second – that’s 1.46x more images per second on each SXM4 A100 GPU compared to the previous-generation PowerEdge
Inference tends to scale linearly – there is no peer-to-peer GPU communication involved – and the XE8545 has proven to have exceptional linear scalability. So it is not surprising that the XE8545 produces excellent high-performance inference results. As with training, the 80GB/500W A100 GPU has a performance edge – 10% faster than the 400W GPU (at a proportionally higher power draw).
MIG - Multi-Instance GPUs - 7X Faster for Inferencing
The innovative Multi-Instance GPU (MIG) technology introduced with the A100 GPU allows the XE8545 to partition each A100 GPU into as many as seven “slices”, each fully isolated with its own high-bandwidth memory, cache, and compute cores. So, if fully utilized, an XE8545 server can be running 28 separate high-performance instances of inferencing. Each of those instances has been determined by NVIDIA to provide performance equivalent to the previous generation V100. So, an A100 GPU can be thought of as 7 times faster than the previous generation – specifically for inferencing, where peer-to-peer communication does not come into play.
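The 28-instance figure is simple arithmetic on the numbers above – up to seven isolated slices per A100, four A100s per XE8545, and NVIDIA’s rating of one slice as roughly one previous-generation V100:

```python
# Back-of-the-envelope for the MIG claims above, using only figures
# stated in the text.
gpus_per_server = 4   # A100 GPUs in an XE8545
slices_per_gpu = 7    # maximum MIG slices per A100

total_instances = gpus_per_server * slices_per_gpu
print(total_instances)  # 28 fully isolated inference instances per server

# Per GPU: 7 slices x ~1 V100-equivalent each ≈ 7x the previous
# generation, for inference workloads with no peer-to-peer traffic.
```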
NVIDIA Certified - Faster Deployment of Machine Learning Environments
The XE8545 has undergone NVIDIA’s comprehensive certification program for data center AI: NVIDIA GPU Cloud (NGC). It is now certified to run at the latest Gen4 networking speeds and can take advantage of the NGC catalog, which hosts frameworks and containers for the top AI, ML, and HPC software, already tuned, tested, and optimized. With NGC certification, data centers can quickly and easily deploy machine learning environments with confidence and get results faster. For more details, see NVIDIA’s documentation on NVIDIA-Certified Systems.
A New Server - New Technologies - New Levels of Accelerated Performance
The PowerEdge XE8545 introduces the latest industry technologies in a combination that delivers the kind of high-performance, accelerated computing that can handle even the most demanding Artificial Intelligence and Machine Learning workloads or scientific high-performance computing analysis. It provides the highest levels of power and performance in an air-cooled environment, simplifying operational continuity in enterprise data centers.
Related Documents

Accelerating AI Inferencing with Dell PowerEdge XE9680: A Performance Analysis
Tue, 28 Mar 2023 23:05:16 -0000
Executive Summary
The Dell PowerEdge XE9680 is a high-performance server designed and optimized to enable uncompromising performance for artificial intelligence, machine learning, and high-performance computing workloads. Dell PowerEdge is launching our innovative 8-way GPU platform with advanced features and capabilities.
- 8x NVIDIA H100 80GB 700W SXM GPUs or 8x NVIDIA A100 80GB 500W SXM GPUs
- 2x Fourth Generation Intel® Xeon® Scalable Processors
- 32x DDR5 DIMMs at 4800MT/s
- 10x PCIe Gen 5 x16 FH Slots
- 8x SAS/NVMe SSD Slots (U.2) and BOSS-N1 with NVMe RAID
This Direct from Development (DfD) tech note provides valuable insights on AI inferencing performance for the recently launched PowerEdge XE9680 server by Dell Technologies.
Testing
To evaluate the inferencing performance of each GPU option available on the new PowerEdge XE9680, the Dell CET AI Performance Lab and the Dell HPC & AI Innovation Lab selected several popular AI models for benchmarking. Additionally, to provide a basis for comparison, they also ran the benchmarks on the previous-generation PowerEdge XE8545. The following workloads were chosen for the evaluation:
- BERT-large (Bidirectional Encoder Representations from Transformers) – Natural language processing like text classification, sentiment analysis, question answering, and language translation
  - XE8545 Batch Size 512
  - XE9680-A100 Batch Size 512
  - XE9680-H100 Batch Size 1024
- ResNet (Residual Network) – Image recognition: classification, object detection, and segmentation
  - XE8545 Batch Size 2048
  - XE9680-A100 Batch Size 2048
  - XE9680-H100 Batch Size 2048
- RNNT (Recurrent Neural Network Transducer) – Speech recognition: converts audio signal to words
  - XE8545 Batch Size 2048
  - XE9680-A100 Batch Size 2048
  - XE9680-H100 Batch Size 2048
- RetinaNET – Object detection in images
  - XE8545 Batch Size 16
  - XE9680-A100 Batch Size 32
  - XE9680-H100 Batch Size 16
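For reference, the benchmark matrix above can be collected into a simple structure (model → system → batch size); all values are taken directly from the list:

```python
# Batch sizes used in the evaluation, as stated in the text.
batch_sizes = {
    "BERT-large": {"XE8545": 512,  "XE9680-A100": 512,  "XE9680-H100": 1024},
    "ResNet":     {"XE8545": 2048, "XE9680-A100": 2048, "XE9680-H100": 2048},
    "RNNT":       {"XE8545": 2048, "XE9680-A100": 2048, "XE9680-H100": 2048},
    "RetinaNET":  {"XE8545": 16,   "XE9680-A100": 32,   "XE9680-H100": 16},
}

# Example lookup: batch size used for BERT-large on the H100 system.
print(batch_sizes["BERT-large"]["XE9680-H100"])  # 1024
```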
Performance
The results are remarkable! The PowerEdge XE9680 demonstrates exceptional inferencing performance!

+300%: PowerEdge XE9680 NVIDIA A100 to H100 performance (1)

+700%: When compared to PowerEdge XE8545 (2)
Comparing the NVIDIA A100 SXM configuration with the NVIDIA H100 SXM configuration on the same PowerEdge XE9680 reveals up to a 300% improvement in inferencing performance! (1)
Even more impressive is the comparison between the PowerEdge XE9680 NVIDIA H100 SXM server and the XE8545 NVIDIA A100 SXM server, which shows up to a 700% improvement in inferencing performance! (2)
Here are the results of each benchmark. In all cases, higher is better.
With exceptional AI inferencing performance, the PowerEdge XE9680 sets a high benchmark for today’s and tomorrow's AI demands. Its advanced features and capabilities provide a solid foundation for businesses and organizations to take advantage of AI and unlock new opportunities.
Contact your account executive or visit www.dell.com to learn more.
Table 1. Server configuration
(1) Testing conducted by Dell in March of 2023. Performed on PowerEdge XE9680 with 8x NVIDIA H100 SXM5-80GB and PowerEdge XE9680 with 8x NVIDIA A100 SXM4-80GB. Actual results will vary.
(2) Testing conducted by Dell in March of 2023. Performed on PowerEdge XE9680 with 8x NVIDIA H100 SXM5-80GB and PowerEdge XE8545 with 4x NVIDIA A100-SXM-80GB. Actual results will vary.

Unlocking Machine Learning with Dell PowerEdge XE9680: Insights into MLPerf 2.1 Training Performance
Tue, 28 Mar 2023 23:05:15 -0000
Executive Summary
The Dell PowerEdge XE9680 is a high-performance server designed and optimized to enable uncompromising performance for artificial intelligence, machine learning, and high-performance computing workloads. Dell PowerEdge is launching our innovative 8-way GPU platform with advanced features and capabilities.
- 8x NVIDIA H100 80GB 700W SXM GPUs or 8x NVIDIA A100 80GB 500W SXM GPUs
- 2x Fourth Generation Intel® Xeon® Scalable Processors
- 32x DDR5 DIMMs at 4800MT/s
- 10x PCIe Gen 5 x16 FH Slots
- 8x SAS/NVMe SSD Slots (U.2) and BOSS-N1 with NVMe RAID
This tech note, Direct from Development (DfD), offers valuable insights into the performance of the PowerEdge XE9680 using MLPerf 2.1 benchmarks from MLCommons.
Testing
MLPerf is a suite of benchmarks that assess the performance of machine learning (ML) workloads, with a focus on two crucial aspects of the ML life cycle: training and inference. This tech note specifically delves into the training aspect of MLPerf.
The Dell CET AI Performance Lab and the Dell HPC & AI Innovation Lab conducted MLPerf 2.1 Training benchmarks using the latest PowerEdge XE9680 equipped with 8x NVIDIA A100 80GB SXM GPUs. For comparison, we also ran these tests on the previous-generation PowerEdge XE8545, equipped with 4x NVIDIA A100 80GB SXM GPUs. The following section presents the results of our tests. Please note that in the figure below, a lower number indicates better performance, and the results have not been verified by MLCommons.
Performance
Figure 1. MLPERF 2.1 Training
Our latest server, the PowerEdge XE9680 with 8x NVIDIA A100 80GB SXM GPUs, delivers on average twice the performance of our previous-generation server. This translates to faster AI model training, enabling models to be trained in half the time! With the PowerEdge XE9680, you can accelerate your AI workloads and achieve better results, faster than ever before. Contact your account executive or visit www.dell.com to learn more.
Table 1. Server configuration
(1) Testing conducted by Dell in March of 2023. Performed on PowerEdge XE9680 with 8x NVIDIA A100 SXM4-80GB and PowerEdge XE8545 with 4x NVIDIA A100-SXM-80GB. Unverified MLPerf v2.1 BERT NLP v2.1, Mask R-CNN object detection heavy-weight v2.1 COCO 2017, 3D U-Net image segmentation v2.1 KiTS19, RNN-T speech recognition v2.1 rnnt Training. Result not verified by MLCommons Association. The MLPerf name and logo are trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use is strictly prohibited. See www.mlcommons.org for more information. Actual results will vary.