This white paper demonstrates the high object detection throughput achieved when a Dell PowerEdge R6515 server and NVIDIA A2 GPUs work together to process multiple video streams. It follows a previous paper, Developing and Deploying Vision AI with Dell and NVIDIA Metropolis (delltechnologies.com), which describes in detail the development of the same RetinaNet object detection inference model, but with a T4 GPU instead of the two A2 GPUs used for the results presented here.
Deploying vision AI applications can provide businesses with insights and intelligence that can be transformational. NVIDIA A2 GPUs running in a Dell PowerEdge R6515 server provide high object detection inference throughput that can be used as part of such vision AI solutions. In this paper, we again use the power of transfer learning from the NVIDIA TAO Toolkit to fine-tune a RetinaNet model and deploy it for inference using DeepStream, an end-to-end intelligent video analytics pipeline. The engine optimized for the A2 produces more than 200 frames per second (fps) of object detection inference on the 1280 x 720 test images that we target from the KITTI dataset. This total fps throughput can be spread across multiple video streams by using NVIDIA Triton Inference Server to orchestrate the flows within the DeepStream pipeline. Additionally, if inference is performed on every fourth frame rather than on each frame, the total number of frames per second handled across the streams exceeds 700.
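For context, the frame-skipping behavior described above corresponds to the interval setting on DeepStream's inference plugin. The following is a minimal sketch, not the pipeline used for the published results: it assumes the DeepStream GStreamer plugins and Python GObject bindings are installed, uses the Gst-nvinfer plugin rather than the Triton-backed Gst-nvinferserver for brevity, and the file paths and config file name are placeholders.

# Minimal DeepStream pipeline sketch (assumptions: DeepStream GStreamer plugins
# and Python GObject bindings installed; all paths below are placeholders).
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# interval=3 tells the inference plugin to skip three frames between inferences,
# that is, to run the RetinaNet engine on every fourth frame of the stream.
pipeline = Gst.parse_launch(
    "filesrc location=/path/to/sample_720p.h264 ! h264parse ! nvv4l2decoder ! "
    "m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! "
    "nvinfer config-file-path=/path/to/retinanet_pgie_config.txt interval=3 ! "
    "nvvideoconvert ! nvdsosd ! fakesink"
)

pipeline.set_state(Gst.State.PLAYING)
bus = pipeline.get_bus()
# Block until end-of-stream or an error is posted on the bus.
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)

In a multi-stream deployment, additional sources would be batched through nvstreammux and the Triton-backed inference element would be substituted, but the interval concept is the same.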