This white paper shows the high object detection throughput achieved when a Dell PowerEdge R7515 server and an NVIDIA A30 GPU work together to process multiple video streams. It follows a previous paper, Developing and Deploying Vision AI with Dell and NVIDIA Metropolis (delltechnologies.com), which describes in detail the development of the same RetinaNet object detection inference model, but with a T4 GPU instead of the A30 used for the results presented here.
Deploying vision AI applications can provide businesses with transformational insights and intelligence. NVIDIA A30 GPUs running in a Dell PowerEdge R7515 server provide exceptionally high object detection inference throughput, which can be used as part of such vision AI solutions. In this paper we again use transfer learning with the NVIDIA TAO Toolkit to fine-tune a RetinaNet model and deploy it for inference using DeepStream, an end-to-end intelligent video analytics pipeline. The optimized engine for the A30 delivers more than 800 frames per second (fps) of object detection inference on the KITTI dataset test images we target. This total throughput can be divided among multiple video streams, using NVIDIA’s Triton Inference Server software to orchestrate the flows within the DeepStream pipeline. Additionally, if inference is performed on every fourth frame rather than on every frame, the total number of frames handled across the streams exceeds 2200 per second.
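To make the throughput figures concrete, the short sketch below converts an aggregate frame rate into a number of concurrent video streams. It is illustrative arithmetic only, not part of the measured pipeline: the function name `streams_supported` is hypothetical, and the 30 fps per-stream rate is an assumption that is not stated in this paper. Because the reported figures are lower bounds ("more than 800" and "exceeds 2200"), the resulting stream counts are also lower bounds.

```python
def streams_supported(total_fps: float, stream_fps: float = 30.0) -> int:
    """Return how many whole streams fit within an aggregate frame rate.

    total_fps  -- aggregate frames per second the system can handle
    stream_fps -- frame rate of each video stream (30 fps is an assumption)
    """
    if total_fps < 0 or stream_fps <= 0:
        raise ValueError("total_fps must be >= 0 and stream_fps must be > 0")
    # Floor division: a partially served stream does not count.
    return int(total_fps // stream_fps)


# Lower-bound figures reported in this paper.
EVERY_FRAME_FPS = 800    # inference on every frame
EVERY_FOURTH_FPS = 2200  # total stream frames when inferring on every fourth frame

if __name__ == "__main__":
    print(streams_supported(EVERY_FRAME_FPS))   # 26 streams at an assumed 30 fps
    print(streams_supported(EVERY_FOURTH_FPS))  # 73 streams at an assumed 30 fps
```

Under the 30 fps assumption, the reported throughput corresponds to at least 26 streams with inference on every frame and at least 73 streams with inference on every fourth frame. Passing a different `stream_fps` adapts the estimate to other camera frame rates.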