Training the RetinaNet or ResNet18 model with TAO Toolkit and deploying it with DeepStream and Triton
As described earlier, we retrain the model on images from our target KITTI training set inside the RetinaNet Jupyter notebook, retinanet.ipynb, using the TAO tools. The TAO converter then converts both the unpruned and pruned models from the .etlt format into TensorRT inference engines, creating a separate engine for each of the 32-bit floating-point (FP32), 16-bit floating-point (FP16), and 8-bit integer (INT8) precisions.
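As a point of reference, here is a minimal sketch of the training invocation the notebook issues through the TAO launcher. The spec file name, results directory, and $KEY value are placeholders for your environment:

```bash
# Launch RetinaNet retraining with the TAO launcher CLI.
# -e: experiment spec file, -r: results directory, -k: model encoding key.
tao retinanet train \
  -e specs/retinanet_train_resnet18_kitti.txt \
  -r experiment_dir_unpruned \
  -k "$KEY" \
  --gpus 1
```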
For deployment, we use the NVIDIA reference application deepstream-app to run the object-detection inference task, together with the DeepStream-Triton container image from the NVIDIA NGC catalog, on the PowerEdge server. The progression from development to deployment comprises five steps, shown in the following figure.
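A sketch of launching the DeepStream-Triton container from NGC and running the reference app follows. The image tag and the deepstream-app config file name are illustrative, not prescriptive; use the tag matching your DeepStream release and a config whose nvinferserver section points at your Triton model repository (assembled in a later step):

```bash
# Start the DeepStream-Triton container with GPU access, mounting the
# model repository prepared on the host (path is a placeholder).
docker run --gpus all -it --rm \
  -v "$(pwd)/model_repository:/opt/models" \
  nvcr.io/nvidia/deepstream:6.1.1-triton

# Inside the container, run the reference application against a
# deepstream-app configuration file (name is hypothetical):
deepstream-app -c source1_primary_detector_triton.txt
```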
This workflow breaks down into substeps as detailed below:
The main subtasks for running each training epoch are detailed below. The training process for RetinaNet takes as inputs the training configuration files, the pre-trained weights, the datasets, and retinanet_labels.txt; the labels file itself is just a plain-text class list, sketched next.
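The labels file is simply one class name per line, in the order the network outputs them. A minimal sketch, with illustrative KITTI classes:

```bash
# Write a minimal labels file; the class names shown are examples only
# and must match the classes in your training spec.
cat > retinanet_labels.txt <<'EOF'
car
cyclist
pedestrian
EOF
```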
The TAO converter turns the model into a device-specific, high-efficiency C++/CUDA TensorRT inference engine at one of the three precisions (FP32, FP16, or INT8). These TensorRT engines are suitable for running within the DeepStream-Triton framework.
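A sketch of building an engine from the exported .etlt file with tao-converter follows. The input dimensions, output node name, paths, and key are placeholders for your model, and flag details can vary by TAO version:

```bash
# -k: encoding key used at export time; -d: input dims (C,H,W);
# -o: output node name(s); -t: target precision; -e: output engine path.
tao-converter \
  -k "$KEY" \
  -d 3,544,960 \
  -o NMS \
  -t fp16 \
  -e engines/retinanet_fp16.plan \
  export/retinanet_resnet18.etlt
# An INT8 build additionally requires a calibration cache, passed with -c.
```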
The model repository is a configurable root directory under which all the models reside. All the plug-in instances in a single process must share the same model root. Each model's folder holds the model itself, the labels.txt file, and the Triton model configuration file config.pbtxt, which provides required and optional information about the model.
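Below is a minimal sketch of assembling such a repository for the FP16 engine built above. The model name, tensor names, and dimensions are placeholders; the real input and output names and shapes should be read from the engine itself, for example with Polygraphy as described below:

```bash
# Triton expects <root>/<model-name>/<version>/model.plan for TensorRT models.
mkdir -p model_repository/retinanet_resnet18/1
cp engines/retinanet_fp16.plan model_repository/retinanet_resnet18/1/model.plan
cp retinanet_labels.txt model_repository/retinanet_resnet18/labels.txt

# Write a minimal config.pbtxt; tensor names and dims are illustrative.
cat > model_repository/retinanet_resnet18/config.pbtxt <<'EOF'
name: "retinanet_resnet18"
platform: "tensorrt_plan"
max_batch_size: 1
input [
  {
    name: "Input"
    data_type: TYPE_FP32
    dims: [ 3, 544, 960 ]
  }
]
output [
  {
    name: "NMS"
    data_type: TYPE_FP32
    dims: [ 1, 200, 7 ]
    label_filename: "labels.txt"
  }
]
EOF
```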
The DeepStream configuration files are in the samples folder and are described below. A useful tool during this configuration step is NVIDIA Polygraphy, which can introspect a model and report the input and output tensor names and shapes that the Triton model configuration requires; a sketch of its use follows.
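Assuming the FP16 engine built earlier, the inspection can be done as follows; the flag usage reflects recent Polygraphy releases:

```bash
# Load the serialized TensorRT engine and print its bindings, including
# input/output tensor names, shapes, and data types.
polygraphy inspect model engines/retinanet_fp16.plan --model-type engine
```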
The device-specific, optimized TensorRT engine is now ready to run within the DeepStream-Triton framework.