The software components for the solution include:
- VMware vSphere 7 supports virtualized NVIDIA GPUs for both VMs and containers through VMware vSphere with Tanzu.
- VMware vSAN 7 is a software-defined storage solution from VMware, built from the ground up for vSphere VMs.
- NVIDIA AI Enterprise is an end-to-end, cloud-native suite of AI and data analytics software that is optimized, certified, and supported by NVIDIA to run exclusively on VMware vSphere with NVIDIA-Certified Systems. It includes the following components for Tanzu support:
  - Operators (required for Tanzu Kubernetes clusters): NVIDIA GPU Operator automates the provisioning of the NVIDIA drivers (to enable CUDA), the Kubernetes device plug-in for GPUs, and the NVIDIA Container Runtime on worker nodes. NVIDIA Network Operator configures ConnectX network adapters for GPUDirect RDMA.
  - AI and data science frameworks: These frameworks include the following validated and fully supported containers: TensorFlow, PyTorch, NVIDIA RAPIDS, NVIDIA TensorRT, and NVIDIA Triton Inference Server.
- The Tanzu ecosystem is required for Tanzu Kubernetes clusters and includes the following components:
  - Harbor is a container registry for Tanzu Kubernetes clusters.
  - Prometheus and Grafana are open-source tools that enable you to gather, visualize, and analyze metrics on Tanzu Kubernetes clusters. Both are supported with VMware Tanzu and NVIDIA AI Enterprise.
- VMware NSX Advanced Load Balancer (Avi) with Cloud Services provides multicloud load balancing, web application firewall, and container ingress services. It can be used to load balance AI use cases such as machine learning operations (MLOps) applications or inference workloads.
- Tanzu Mission Control is a centralized hub for simplified, multicloud, multicluster Kubernetes management. It provides life cycle management for Kubernetes clusters, enabling administrators to provision, scale, upgrade, and delete Tanzu Kubernetes Grid clusters.
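
As an illustration of how a workload consumes the GPUs that the NVIDIA GPU Operator exposes, the following minimal sketch builds a Kubernetes pod manifest that requests a GPU through the `nvidia.com/gpu` resource name advertised by the NVIDIA device plug-in. The helper `gpu_pod_manifest`, the pod name, and the image path `registry.example.com/cuda-base:latest` are hypothetical placeholders, not part of this solution. The sketch also assumes that the chosen image can run `nvidia-smi` once the NVIDIA Container Runtime is in place.

```python
import json

# Extended resource name advertised by the NVIDIA Kubernetes device plug-in.
GPU_RESOURCE = "nvidia.com/gpu"


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Return a pod manifest (as a dict) that requests `gpus` NVIDIA GPUs.

    Hypothetical helper for illustration only; adapt names and images
    to your own registry (for example, a Harbor project).
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Assumption: the image can run nvidia-smi as a smoke test.
                    "command": ["nvidia-smi"],
                    # GPUs are requested under limits, like other extended resources.
                    "resources": {"limits": {GPU_RESOURCE: gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = gpu_pod_manifest(
        "gpu-smoke-test",
        "registry.example.com/cuda-base:latest",  # placeholder image path
    )
    # kubectl accepts JSON as well as YAML manifests.
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with `kubectl apply -f` against a Tanzu Kubernetes cluster whose worker nodes have been prepared by the GPU Operator. Whether the pod is scheduled depends on the GPU resources actually available in that cluster.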