PowerEdge R750xa and NVIDIA H100 PCIe GPU: 66% Increase in HPC Performance per Watt
Mon, 16 Jan 2023 19:49:21 -0000
Summary
Using the PowerEdge R750xa, the Dell HPC & AI Innovation Lab compared the performance of the new NVIDIA H100 PCIe 310 W GPU to the previous-generation NVIDIA A100 PCIe GPU, using the HPL (High-Performance Linpack) supercomputer benchmark. Results showed:
- 66% increase in performance per watt
- 67% increase in raw performance (TFLOPS), using four GPUs
The Dell PowerEdge R750xa, powered by 3rd Gen Intel Xeon Scalable processors, is a dual-socket, 2U rack server built for demanding, GPU-intensive workloads. It supports 8 memory channels per CPU and up to 32 DDR4 DIMMs at speeds up to 3200 MT/s. In addition, the PowerEdge R750xa supports PCIe Gen 4 and up to 8 SAS/SATA SSDs or NVMe drives. The PowerEdge R750xa, the one PowerEdge portfolio platform that supports all the PCIe GPUs, is the ideal server for virtualization environments and for workloads such as high performance computing and AI-ML/DL training and inferencing. The PowerEdge R750xa includes all the core benefits of PowerEdge: serviceability, consistent systems management with iDRAC, and the latest in extreme acceleration.
The new NVIDIA H100 PCIe GPU is optimal for delivering the fastest business outcomes with the latest accelerated servers in the Dell PowerEdge portfolio, starting with the R750xa. The PowerEdge R750xa boosts workloads to new performance heights with GPU and accelerator support for demanding workloads, including enterprise AI. With its enhanced, air-cooled design and support for up to four NVIDIA double-width GPUs, the PowerEdge R750xa server is purpose-built for optimal performance for the entire spectrum of HPC, AI-ML/DL training, and inferencing workloads.
Next-generation GPU performance analysis
The Dell HPC & AI Innovation Lab compared the performance of the new NVIDIA H100 PCIe 310 W GPU to the previous-generation NVIDIA A100 PCIe GPU in the Dell PowerEdge R750xa. The team used HPL, the benchmark used to rank supercomputers on the TOP500 list. The comparison covered both HPL performance and server power consumption throughout the benchmark run. Here are the results:
Performance/watt
The performance per watt calculation is the HPL benchmark score divided by the average server power over the duration of the HPL benchmark. The PowerEdge R750xa with the NVIDIA H100 PCIe GPUs delivered a 66% increase in performance/watt compared to the PowerEdge R750xa with the NVIDIA A100 PCIe GPUs, as shown in the following figure.
PowerEdge R750xa - HPL Benchmark and Server Power
Figure 1. Performance/watt comparison
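The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the lab's tooling: the function names `perf_per_watt` and `percent_increase` are hypothetical, and the input values are round placeholders chosen for easy arithmetic, not the measured results.

```python
def perf_per_watt(hpl_tflops: float, avg_server_power_w: float) -> float:
    """Return HPL performance per watt in GFLOPS/W.

    hpl_tflops: HPL benchmark score in TFLOPS.
    avg_server_power_w: average server power over the HPL run, in watts.
    """
    if avg_server_power_w <= 0:
        raise ValueError("average server power must be positive")
    # 1 TFLOPS = 1000 GFLOPS
    return hpl_tflops * 1000.0 / avg_server_power_w


def percent_increase(new: float, old: float) -> float:
    """Return the relative increase of `new` over `old`, in percent."""
    if old <= 0:
        raise ValueError("baseline value must be positive")
    return (new - old) / old * 100.0


# Placeholder inputs for illustration only (NOT the measured results).
baseline = perf_per_watt(hpl_tflops=40.0, avg_server_power_w=2000.0)
candidate = perf_per_watt(hpl_tflops=60.0, avg_server_power_w=2000.0)
gain_pct = percent_increase(candidate, baseline)
```

With the average power held equal, as in these placeholder inputs, the performance/watt gain equals the raw performance gain.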
HPL benchmark performance
Figure 2 shows the raw HPL performance of each configuration. The PowerEdge R750xa with four NVIDIA H100 PCIe GPUs achieved a 67% increase in TFLOPS compared to the configuration with four NVIDIA A100 PCIe GPUs.
Figure 2. Raw performance comparison
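Because performance per watt is the HPL score divided by average server power, the raw gain and the performance/watt gain together determine how the average power of the two configurations compares. The sketch below works that out from the two percentages reported in this document. The function name `implied_power_ratio` is hypothetical, and because the published percentages are rounded, the result only indicates that average power was nearly the same; it says nothing about the shape of the power trace.

```python
def implied_power_ratio(perf_gain_pct: float, perf_per_watt_gain_pct: float) -> float:
    """Return the ratio of average power (new / old) implied by two gains.

    perf/watt = perf / power, so
    power_new / power_old = (perf_new / perf_old) / (ppw_new / ppw_old).
    """
    perf_ratio = 1.0 + perf_gain_pct / 100.0
    ppw_ratio = 1.0 + perf_per_watt_gain_pct / 100.0
    if ppw_ratio <= 0:
        raise ValueError("performance/watt ratio must be positive")
    return perf_ratio / ppw_ratio


# Rounded percentages as reported in this document.
ratio = implied_power_ratio(perf_gain_pct=67.0, perf_per_watt_gain_pct=66.0)
```

A ratio close to 1 means the two configurations drew nearly the same average power over the run, so almost all of the performance/watt improvement comes from the higher HPL score.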
Server power
Figure 3 shows the server power over the duration of the HPL benchmark. The NVIDIA H100 PCIe GPU configuration delivered better performance with slightly lower server power and finished the workload faster.
Figure 3. HPL server power
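The average server power used in the performance/watt calculation can be derived from a power trace like the one in Figure 3. The source does not say how the trace was sampled or averaged, so the following is only a sketch under stated assumptions: power readings arrive as (timestamp, watts) pairs, and the average is time-weighted using the trapezoidal rule. The function name `average_power_w` and the sample values are hypothetical.

```python
from typing import Sequence


def average_power_w(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Return the time-weighted average power of a sampled trace, in watts.

    Integrates power over time with the trapezoidal rule (giving energy in
    joules), then divides by the total duration of the trace.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have equal length")
    if len(timestamps_s) < 2:
        raise ValueError("need at least two samples")

    energy_j = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        energy_j += 0.5 * (power_w[i] + power_w[i - 1]) * dt

    return energy_j / (timestamps_s[-1] - timestamps_s[0])


# Placeholder trace for illustration only (NOT measured data).
avg = average_power_w([0.0, 10.0, 20.0], [1000.0, 2000.0, 1000.0])
```

A time-weighted average handles unevenly spaced samples correctly, which a plain mean of the readings does not.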
Configuration information
The following table shows the two test configurations.
Table 1. R750xa test configurations
| | R750xa with four NVIDIA H100 | R750xa with four NVIDIA A100 |
|---|---|---|
| Server | PowerEdge R750xa | PowerEdge R750xa |
| CPU | 2 x Intel Xeon Gold 6338 CPU | 2 x Intel Xeon Gold 6338 CPU |
| Memory | 512 GB system memory | 512 GB system memory |
| Storage | 1 x 3.5 TB SSD | 1 x 3.5 TB SSD |
| BIOS/iDRAC | 1.9.0/6.0.0.0 | 1.9.0/6.0.0.0 |
| HPL version | HPL for H100 (Alpha version, results subject to change) | HPL for H100 (Alpha version, results subject to change) |
| Operating system | Ubuntu 20.04 LTS | Ubuntu 20.04 LTS |
| GPU | NVIDIA H100-PCIe-80GB (310 W) | NVIDIA A100-PCIe-80GB (300 W) |
| Driver | CUDA 11.8 | CUDA 11.8 |
Conclusion
Using the PowerEdge R750xa, the Dell HPC & AI Innovation Lab compared the performance of the new NVIDIA H100 PCIe 310 W GPU to the previous-generation NVIDIA A100 PCIe GPU. With four GPUs in each configuration, HPL benchmark results showed a 66% increase in performance/watt and a 67% increase in TFLOPS.
The PowerEdge R750xa supports up to four NVIDIA H100 PCIe GPUs and is available with new orders or as a customer upgrade kit for existing deployments. To learn more, reach out to your account executive or visit www.dell.com.