NVIDIA Triton Inference Server overview
Triton Inference Server is software for deploying neural-network and classical machine-learning models in production environments at scale. It can serve multiple models concurrently from a single server instance. Triton simplifies the deployment of commercial inference services and delivers low-latency, real-time inferencing while maximizing both GPU and CPU utilization. For more information, see the NVIDIA Triton Inference Server documentation.
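To make the deployment model concrete, the sketch below shows a minimal Triton model repository layout and a `config.pbtxt` model configuration. The model name (`resnet50_onnx`), file names, and tensor shapes are hypothetical placeholders chosen for illustration; the repository structure and configuration fields follow Triton's documented conventions.

```
model_repository/
└── resnet50_onnx/          # one directory per served model (name is an example)
    ├── config.pbtxt        # model configuration read by Triton at startup
    └── 1/                  # numeric version directory
        └── model.onnx      # the model file for the chosen backend

# config.pbtxt (example contents)
name: "resnet50_onnx"
platform: "onnxruntime_onnx"   # backend; e.g. tensorrt_plan, pytorch_libtorch
max_batch_size: 8              # enables dynamic batching up to this size
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]    # example image tensor; first dim is the batch
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]           # example class-score vector
  }
]
instance_group [
  { kind: KIND_GPU, count: 1 }  # run one model instance on the GPU
]
```

Because each model lives in its own versioned directory, multiple models (and multiple versions of a model) can be served side by side from one repository, which is how Triton supports concurrent multi-model deployment.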