The following table provides details about the PowerEdge R750xa server configuration and software environment for the MLPerf Inference v1.0 submission:
CPU: 2 x Intel Xeon Gold 6338, 32 cores @ 2.00 GHz
Memory: 256 GB (16 x 16 GB, 3200 MT/s)
Storage: 2 x 1.8 TB SSD (no RAID)
GPU: 4 x NVIDIA A100-PCIe-40GB
Other software versions: NVIDIA TensorRT 7.2.3, NVIDIA CUDA 11.1, NVIDIA cuDNN 8.1.1, Driver 460.32.03, NVIDIA DALI 0.30.0, NVIDIA Triton 21.02
ECC on GPU
MLPerf v1.0 System ID
The PowerEdge R750xa server is designed specifically for accelerated workloads, which makes it ideal for cutting-edge machine learning models, high-performance computing (HPC), and GPU virtualization. It uses Ice Lake processors and supports up to four double-wide GPUs. The PowerEdge R750xa server supports all PCIe data center GPUs in the PowerEdge portfolio, including the NVIDIA and AMD GPU product lines.
The PowerEdge R750xa server has undergone NVIDIA’s comprehensive certification program and is NVIDIA-Certified for enterprise AI (https://www.nvidia.com/en-us/data-center/products/certified-systems/). The PowerEdge R750xa server supports the newest NVIDIA GPUs (such as the A100, A40, A30, and A10 GPUs) as well as PCIe Gen 3 GPUs (such as the M10 and T4 GPUs). The PowerEdge R750xa server is also flexible: it can be configured with either two or four GPUs, so the data center administrator is not required to populate all four GPU slots. The server also supports the NVIDIA NVLink Bridge.
The PowerEdge R750xa server also supports GPUs with Multi-Instance GPU (MIG) capability, such as the NVIDIA A100 GPU. This feature enables the user to partition an A100 GPU into a maximum of seven isolated instances. Each instance can be assigned to a different user, workload, or application, which helps to increase GPU utilization.
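As a rough sketch of how MIG partitioning is configured, the following `nvidia-smi` commands enable MIG mode and carve an A100-40GB into 1g.5gb instances. The profile ID (19 for 1g.5gb) and instance counts are assumptions that vary by GPU model; check `nvidia-smi mig -lgip` on the target system, and note that enabling MIG requires root privileges and resets the GPU.

```shell
# Enable MIG mode on GPU 0 (requires root; resets the GPU)
sudo nvidia-smi -i 0 -mig 1

# Create seven 1g.5gb GPU instances; -C also creates the matching
# compute instances. Profile 19 maps to 1g.5gb on the A100-40GB.
sudo nvidia-smi mig -i 0 -cgi 19,19,19,19,19,19,19 -C

# List the resulting GPU instances
nvidia-smi mig -i 0 -lgi
```

Each resulting instance appears as a separate device that can be handed to a different container or user.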
The PowerEdge R750xa server is designed to be future-proof. The GPUs are placed in the front of the server, together with up to eight storage drives, to enable higher airflow through the server. This configuration makes it easier to support GPUs with higher thermal design power (TDP) as they are released.
Ice Lake is the successor to Intel’s Cascade Lake processor. The Ice Lake processor has up to 40 cores and supports up to six terabytes of system memory per socket, up to eight channels of DDR4-3200 memory per socket, and up to 64 PCIe Gen 4 lanes per socket. It is also the first Intel Xeon Scalable processor to support PCIe Gen 4, which doubles the bit rate of Gen 3 and is therefore well suited for transferring data between CPUs and GPUs.
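The doubling from Gen 3 to Gen 4 can be checked with a quick back-of-the-envelope calculation. This sketch computes per-direction x16 bandwidth from the raw transfer rate and the 128b/130b line encoding used by both generations, ignoring other protocol overhead; the ~64 GB/s figure often quoted for PCIe Gen 4 x16 is the bidirectional total.

```python
def pcie_x16_gbytes_per_s(gt_per_s: float) -> float:
    """Approximate per-direction bandwidth of a PCIe x16 link.

    gt_per_s: raw transfer rate per lane in GT/s
              (8.0 for Gen 3, 16.0 for Gen 4).
    """
    encoding = 128 / 130                          # 128b/130b line encoding
    bits_per_s = gt_per_s * 1e9 * encoding * 16   # 16 lanes
    return bits_per_s / 8 / 1e9                   # bits -> gigabytes

gen3 = pcie_x16_gbytes_per_s(8.0)    # ~15.8 GB/s per direction
gen4 = pcie_x16_gbytes_per_s(16.0)   # ~31.5 GB/s per direction
print(round(gen3, 1), round(gen4, 1), round(gen4 / gen3, 1))
```

Because the encoding is the same for both generations, the bandwidth ratio is exactly the ratio of the transfer rates: 2x.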
The following table lists the Ice Lake specifications:
3rd Generation Intel Xeon Scalable Processors
Number of CPU cores
Number of threads
Processor base frequency
Maximum turbo speed
ECC memory supported
PCI Express revision
Maximum number of PCIe lanes
The following figure shows the Ice Lake processor:
Four NVIDIA A100 PCIe GPUs were used with the PowerEdge R750xa server for the MLPerf Inference v1.0 submission. The A100 GPU has many features that optimize inference workloads: it supports acceleration at multiple precisions (from FP32 down to INT4), enables structured sparsity, and delivers order-of-magnitude performance gains.
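To illustrate what reduced-precision inference means, the following sketch shows symmetric INT8 quantization in pure Python: FP32 values are mapped to 8-bit integers plus a scale factor, which is the kind of representation A100 Tensor Cores can accelerate. This is only a conceptual example; real inference stacks such as TensorRT calibrate scales per tensor or per channel.

```python
def quantize_int8(values):
    """Symmetric quantization of a list of floats to INT8 plus a scale."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # guard all-zero input
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map INT8 values back to approximate floats."""
    return [x * scale for x in q]

weights = [0.52, -1.27, 0.08, 0.91]
q, s = quantize_int8(weights)
approx = dequantize(q, s)
# Each dequantized value is within one quantization step of the original
assert all(abs(a - w) <= s for a, w in zip(approx, weights))
```

Storing weights as INT8 cuts memory traffic to a quarter of FP32, which is one reason lower precisions speed up inference.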
The following table lists the NVIDIA A100 PCIe GPU specifications:
NVIDIA Tensor cores
NVIDIA CUDA cores
GPU memory: 40 GB HBM2
Interconnect: PCIe Gen 4, 64 GB/s
Supported APIs: NVIDIA CUDA, DirectCompute, OpenCL, OpenACC
Multi-Instance GPU: various instance sizes with up to 7 MIG instances @ 5 GB
The following figure shows the NVIDIA A100 PCIe GPU: