The Dell PowerEdge R750xa server delivers strong performance across deep learning inference tasks such as image classification, object detection, medical image segmentation, speech-to-text, natural language processing, and recommendation. Its Intel CPUs and NVIDIA GPUs deliver high throughput across different deployment scenarios. The higher core counts, greater memory bandwidth, and additional PCIe lanes of Intel Xeon Scalable processors accelerate performance even when an accelerator such as the NVIDIA A100 GPU handles all of the inference processing. The combination of Ice Lake CPUs, NVIDIA A100 GPUs, and the PowerEdge R750xa server is an excellent choice for running inference workloads.