Server comparison

For comparison, we consider the PowerEdge R740 server results that were submitted with 2nd Generation Intel Xeon Scalable processors.

The following table lists the PowerEdge R740 server configuration:

Table 4. PowerEdge R740 configuration

Component	Description
Processor	2 x Intel Xeon Gold 6248R @ 3.00 GHz
Memory	384 GB (16 GB 3200 MT/s * 24)
Local disk	2x 1.8 TB SSD (No RAID)
Operating system	CentOS 8.2.2004
GPU	3x NVIDIA A100-PCIe-40G
CUDA Driver	460.32.03
Other software versions	TensorRT 7.2.3, CUDA 11.1, cuDNN 8.1.1, Driver 460.32.03, DALI 0.30.0, Triton 21.02
System profiles	Performance
PCIe	Gen 3
ECC on GPU	ON
MLPerf v1.0 System ID	R740_A100-PCIe-40GBx3_TRT

Both servers use the same A100-PCIe GPUs. The new PowerEdge R750xa server supports up to four PCIe A100 GPUs, whereas the previous-generation PowerEdge R740 server supports up to three. This difference in GPU count is the main reason that the PowerEdge R750xa server significantly outperforms the PowerEdge R740 server in the overall performance that a single system can deliver. See the following figure for performance results:

Figure 6. DLRM 99 percent and 99.9 percent Offline and Server performance for PowerEdge R750xa and R740 servers

For a like-for-like comparison, we divide the whole-system result by the number of GPUs in the server. The following figure shows the per-GPU results for the DLRM models:

Figure 7. Per-GPU DLRM 99 percent and 99.9 percent Offline and Server performance for PowerEdge R750xa and R740 servers

The figure shows that, per GPU, the PowerEdge R750xa server is 5.8 percent faster in the Offline scenario and approximately 37 percent faster in the Server scenario. The DLRM model delivers extremely high throughput, with thousands of queries processed per second, so the CPU and PCIe bus have a significant impact on server performance. These results demonstrate that for some high-throughput models like DLRM, higher performance can be realized with 3rd Generation Intel Xeon Scalable processors even when the same accelerators are used. Furthermore, PCIe Gen 4 also helps to improve performance in cases like the DLRM Server scenario.
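The per-GPU normalization and the percentage comparison described in this section can be sketched in a few lines of Python. This is only an illustration: the function names and the throughput values are hypothetical placeholders chosen for easy arithmetic, not the submitted MLPerf results. Only the GPU counts (four for the PowerEdge R750xa server, three for the PowerEdge R740 server) come from the text.

```python
def per_gpu(system_throughput: float, gpu_count: int) -> float:
    """Divide a whole-system result by the number of GPUs in the server."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return system_throughput / gpu_count


def percent_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline * 100.0


# Hypothetical whole-system throughput values (queries per second).
# These are placeholders, NOT measured or submitted results.
r750xa_system = 4400.0  # assumed value for a 4-GPU R750xa
r740_system = 3000.0    # assumed value for a 3-GPU R740

r750xa_per_gpu = per_gpu(r750xa_system, gpu_count=4)
r740_per_gpu = per_gpu(r740_system, gpu_count=3)

system_gain = percent_gain(r750xa_system, r740_system)
per_gpu_gain = percent_gain(r750xa_per_gpu, r740_per_gpu)

print(f"Whole-system gain: {system_gain:.1f} percent")
print(f"Per-GPU gain:      {per_gpu_gain:.1f} percent")
```

With any such inputs, the whole-system gain includes the effect of the extra GPU, whereas the per-GPU gain isolates the contribution of the rest of the platform, such as the CPU and PCIe generation.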
