Configuration details

The following table provides details about the PowerEdge R750xa server configuration and software environment for the MLPerf Inference v1.0 submission:

Table 1. Test bed configuration

Component	Description
Processor	2 x Intel Xeon Gold 6338 CPU 32 Cores @ 2.00 GHz
Memory	256 GB (16 x 16 GB, 3200 MT/s)
Local disk	2 x 1.8 TB SSD (No RAID)
Operating system	CentOS 8.2.2004
GPU	4 x NVIDIA A100-PCIe-40G
CUDA Driver	460.32.03
Other software versions	NVIDIA TensorRT 7.2.3, NVIDIA CUDA 11.1, NVIDIA cuDNN 8.1.1, NVIDIA DALI 0.30.0, NVIDIA Triton 21.02
System profiles	Performance
ECC on GPU	ON
MLPerf v1.0 System ID	R750xa_A100-PCIE-40GBx4_TRT
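
As an illustration of how the GPU-related rows of Table 1 could be checked on a running system, the following Python sketch parses the CSV output of nvidia-smi and compares it with the expected GPU count, driver version, and ECC state. The helper names (query_gpus, parse_gpus, find_mismatches) and the sample output string are hypothetical and are not part of the submission; the script assumes that nvidia-smi is on the PATH and reports the ECC mode as Enabled or Disabled.

```python
import subprocess
from typing import Dict, List

# Fields requested from nvidia-smi, in the order they appear in each CSV line.
QUERY_FIELDS = ["name", "driver_version", "memory.total", "ecc.mode.current"]

# Expected values taken from Table 1.
EXPECTED_GPU_COUNT = 4
EXPECTED_DRIVER = "460.32.03"


def query_gpus() -> str:
    """Run nvidia-smi and return one CSV line per GPU (needs an NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_gpus(csv_text: str) -> List[Dict[str, str]]:
    """Turn the CSV text into one dictionary per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        values = [value.strip() for value in line.split(",")]
        rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def find_mismatches(gpus: List[Dict[str, str]]) -> List[str]:
    """Return a list of human-readable differences from Table 1."""
    problems = []
    if len(gpus) != EXPECTED_GPU_COUNT:
        problems.append(f"expected {EXPECTED_GPU_COUNT} GPUs, found {len(gpus)}")
    for index, gpu in enumerate(gpus):
        if "A100" not in gpu["name"]:
            problems.append(f"GPU {index}: unexpected model {gpu['name']}")
        if gpu["driver_version"] != EXPECTED_DRIVER:
            problems.append(f"GPU {index}: driver {gpu['driver_version']}")
        if gpu["ecc.mode.current"] != "Enabled":
            problems.append(f"GPU {index}: ECC is {gpu['ecc.mode.current']}")
    return problems


# Hypothetical sample output, for illustration only.
# On a real system, use SAMPLE = query_gpus() instead.
SAMPLE = "A100-PCIE-40GB, 460.32.03, 40536 MiB, Enabled\n" * 4

problems = find_mismatches(parse_gpus(SAMPLE))
print(problems if problems else "GPU configuration matches Table 1")
```

The comparison is deliberately limited to values that appear in Table 1. The exact model string and memory size that nvidia-smi reports can vary with the driver version, so the sketch only checks that the model name contains A100.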

The PowerEdge R750xa server is designed specifically for accelerated workloads, which makes it well suited to cutting-edge machine learning models, high-performance computing (HPC), and GPU virtualization. It uses Ice Lake processors and supports up to four double-wide GPUs. The PowerEdge R750xa server supports all PCIe data center GPUs in the PowerEdge portfolio, including the NVIDIA and AMD GPU product lines.

The PowerEdge R750xa server has completed NVIDIA’s certification program and is NVIDIA-Certified for enterprise AI (https://www.nvidia.com/en-us/data-center/products/certified-systems/). It supports the newest NVIDIA GPUs (such as the A100, A40, A30, and A10 GPUs) as well as PCIe Gen3 GPUs (such as the M10 and T4 GPUs). The server can be configured with either two or four GPUs, so the data center administrator is not required to populate all four GPU slots. The server also supports the NVIDIA NVLink Bridge.

The PowerEdge R750xa server also supports GPUs that offer the Multi-Instance GPU (MIG) feature, such as the NVIDIA A100 GPU. MIG enables the user to partition an A100 GPU into a maximum of seven individual instances. Each instance can be assigned to a different user, workload, or application, which helps to increase GPU utilization.
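
To make the partitioning concrete, the following Python sketch checks whether a requested mix of MIG instances fits on a single A100 40 GB GPU. It is a simplified model, not the driver's placement logic: the profile table is an assumption based on NVIDIA's published A100 40 GB profile names, and the function fits_on_one_gpu is hypothetical. It compares only the compute-slice and memory totals against the GPU's budget, whereas the driver also enforces placement rules that this sketch ignores.

```python
from typing import List

# Assumed profile table for an A100 40 GB GPU:
# profile name -> (compute slices, memory in GB).
A100_40GB_PROFILES = {
    "1g.5gb": (1, 5),
    "2g.10gb": (2, 10),
    "3g.20gb": (3, 20),
    "4g.20gb": (4, 20),
    "7g.40gb": (7, 40),
}
MAX_COMPUTE_SLICES = 7   # matches the maximum of seven instances
TOTAL_MEMORY_GB = 40     # A100 PCIe 40 GB


def fits_on_one_gpu(requested: List[str]) -> bool:
    """Return True if the requested instances fit the slice and memory budget.

    Simplification: only totals are checked; real MIG placement is stricter.
    """
    compute = sum(A100_40GB_PROFILES[name][0] for name in requested)
    memory = sum(A100_40GB_PROFILES[name][1] for name in requested)
    return compute <= MAX_COMPUTE_SLICES and memory <= TOTAL_MEMORY_GB


# Seven of the smallest instances, one per user or workload.
print(fits_on_one_gpu(["1g.5gb"] * 7))
# One larger instance plus one medium instance.
print(fits_on_one_gpu(["4g.20gb", "3g.20gb"]))
# Two 20 GB instances already use all 40 GB, so a third instance does not fit.
print(fits_on_one_gpu(["3g.20gb", "3g.20gb", "1g.5gb"]))
```

With four GPUs in the server, the same check would apply to each GPU independently, because a MIG instance cannot span GPUs.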

The PowerEdge R750xa server is designed to be future proof. The GPUs are placed in the front of the server, together with up to eight storage drives, to enable higher airflow through the server. This configuration makes it easier to support GPUs with higher thermal design power (TDP) as they are released.

Ice Lake is the successor to Intel’s Cascade Lake processor. The Ice Lake processor supports up to 40 cores, up to 6 TB of system memory per socket, up to eight channels of DDR4-3200 memory per socket, and up to 64 PCIe Gen4 lanes per socket. It is also the first Intel Xeon Scalable processor to support PCIe Gen4, which doubles the per-lane transfer rate of Gen3 (16 GT/s compared with 8 GT/s). The higher transfer rate benefits data movement between CPUs and GPUs.
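
The bandwidth figures behind these statements can be worked out with simple arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and the per-lane rates (8 GT/s for Gen3, 16 GT/s for Gen4), the 128b/130b line encoding, and the 8-byte DDR4 channel width are standard values assumed here rather than taken from this document. The results are theoretical peaks, not measured throughput.

```python
# Per-lane transfer rates in gigatransfers per second (GT/s).
PCIE_GT_PER_S = {"gen3": 8.0, "gen4": 16.0}

# Gen3 and Gen4 both use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(generation: str, lanes: int = 16) -> float:
    """Approximate usable bandwidth in GB/s for one direction of a PCIe link."""
    gigabits_per_s = PCIE_GT_PER_S[generation] * ENCODING_EFFICIENCY * lanes
    return gigabits_per_s / 8  # convert bits to bytes


def ddr4_bandwidth_gb_s(mt_per_s: int, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s for one socket."""
    return mt_per_s * bus_bytes * channels / 1000


gen3_x16 = pcie_bandwidth_gb_s("gen3")
gen4_x16 = pcie_bandwidth_gb_s("gen4")

print(f"PCIe Gen3 x16: {gen3_x16:.1f} GB/s per direction")
print(f"PCIe Gen4 x16: {gen4_x16:.1f} GB/s per direction")
print(f"Gen4 / Gen3 ratio: {gen4_x16 / gen3_x16:.1f}")
print(f"DDR4-3200 x 8 channels: {ddr4_bandwidth_gb_s(3200, 8):.1f} GB/s per socket")
```

Under these assumptions, a Gen4 x16 link provides about 31.5 GB/s in each direction, twice the Gen3 figure of about 15.8 GB/s, and eight channels of DDR4-3200 provide a theoretical peak of 204.8 GB/s per socket.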

The following table lists the specifications of the Intel Xeon Gold 6338 (Ice Lake) processor used in the test bed:

Table 2. Intel Xeon Gold 6338 specifications

Component	Description
Product collection	3rd Generation Intel Xeon Scalable Processors
Code name	Ice Lake
Processor name	Gold 6338
Status	Launched
Number of CPU cores	32
Number of threads	64
Processor base frequency	2.0 GHz
Maximum turbo speed	3.20 GHz
Cache L3	48 MB
Memory type	DDR4-3200
ECC memory supported	Yes
PCI Express revision	4.0
Maximum number of PCIe lanes	64

The following figure shows the Ice Lake processor:

Figure 2. Ice Lake processor

Four NVIDIA A100 PCIe GPUs were used with the PowerEdge R750xa server for the MLPerf Inference v1.0 submission. The A100 GPU has many features that optimize inference workloads. It accelerates a range of precisions (from FP32 to INT4), supports structural sparsity, and delivers orders of magnitude of performance gains.

The following table lists the NVIDIA A100 PCIe GPU specifications:

Table 3. NVIDIA A100 PCIe GPU specifications

Component	Description
GPU architecture	NVIDIA Ampere
NVIDIA Tensor cores	432
NVIDIA CUDA cores	6912
Single precision	19.5 TFLOPS
Double precision	9.7 TFLOPS
INT8	1248 TOPS (with structural sparsity)
INT4	2496 TOPS (with structural sparsity)
GPU memory	40 GB HBM2
ECC	Yes
Memory bandwidth	1,555 GB/s
Interconnect interface	PCIe Gen4: 64 GB/s
Form factor	PCIe
Thermal solution	Passive
Compute APIs	NVIDIA CUDA, DirectCompute, OpenCL, OpenACC
MIG	Various instance sizes with up to 7 MIGs @ 5 GB
TDP	250 watts

The following figure shows the NVIDIA A100 PCIe GPU:

Figure 3. NVIDIA A100 PCIe GPU
