For comparison, we consider the PowerEdge R740 server results that were submitted with 2nd Generation Intel Xeon Scalable processors.
The following table lists the PowerEdge R740 server configuration:
Processor: 2 x Intel Xeon Gold 6248R @ 3.00 GHz
Memory: 384 GB (24 x 16 GB, 3200 MT/s)
Storage: 2 x 1.8 TB SSD (no RAID)
GPU: 3 x NVIDIA A100-PCIe-40G
Other software versions: TensorRT 7.2.3, CUDA 11.1, cuDNN 8.1.1, Driver 460.32.03, DALI 0.30.0, Triton 21.02
ECC on GPU
MLPerf v1.0 System ID
Both servers use the same A100-PCIe GPUs. The new PowerEdge R750xa server supports up to four A100 PCIe GPUs, whereas the previous-generation PowerEdge R740 server supports up to three. This higher GPU count is the main reason that a single PowerEdge R750xa system significantly outperforms a single PowerEdge R740 system in overall throughput. See the following figure for performance results:
To compare the two servers on an equal footing, we divide the system-level result by the number of GPUs in the server. The following figure shows the resulting per-GPU numbers for the DLRM model:
The figure shows that, per GPU, the PowerEdge R750xa server is 5.8 percent faster in the Offline scenario and approximately 37 percent faster in the Server scenario. The DLRM model runs at extremely high throughput, processing thousands of queries per second, so the CPU and the PCIe interconnect have a significant impact on server performance. These results demonstrate that for some high-throughput models such as DLRM, higher performance gains can be realized with 3rd Gen Intel Xeon Scalable processors even when the same accelerators are used. Furthermore, PCIe Gen 4 also helps to improve performance in cases such as the DLRM Server scenario.
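The per-GPU normalization and the percentage comparison described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are hypothetical, and the throughput values are round placeholders chosen for easy checking, not the submitted MLPerf results.

```python
def per_gpu_throughput(system_throughput: float, gpu_count: int) -> float:
    """Divide a whole-system result by the number of GPUs in the server."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return system_throughput / gpu_count


def percent_faster(new: float, old: float) -> float:
    """Return how much faster `new` is than `old`, in percent."""
    if old <= 0:
        raise ValueError("old must be positive")
    return (new / old - 1.0) * 100.0


# Placeholder system-level throughputs (queries per second).
# These are NOT MLPerf results; they only illustrate the calculation.
r750xa_system_qps = 4000.0  # hypothetical value for a 4-GPU R750xa
r740_system_qps = 2400.0    # hypothetical value for a 3-GPU R740

r750xa_per_gpu = per_gpu_throughput(r750xa_system_qps, 4)
r740_per_gpu = per_gpu_throughput(r740_system_qps, 3)

gain = percent_faster(r750xa_per_gpu, r740_per_gpu)
print(f"Per-GPU gain: {gain:.1f} percent")
```

Normalizing first matters because a system-level comparison mixes two effects: the extra GPU in the PowerEdge R750xa server and any platform improvement from the CPU and PCIe generation. Dividing by the GPU count isolates the second effect.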