Customers can run AI workloads on VMware vSphere and access virtualized NVIDIA GPUs in two scenarios:
- Running AI workloads as virtual machines (VMs)—Partitioned, full, or multiple GPUs are allocated to VMs. The required software and drivers are installed in the VMs so that workloads can take advantage of the accelerated hardware.
- Running AI workloads as Kubernetes pods in a Tanzu Kubernetes cluster (TKC)—Worker node templates incorporate GPU resources, and worker nodes can be provisioned dynamically. NVIDIA operators automatically configure the worker nodes so that workloads can take advantage of the accelerated hardware. In addition, Tanzu offers a rich ecosystem of software products for managing, monitoring, and operating the Kubernetes environment.
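In the Kubernetes scenario, a pod typically requests GPU capacity through the extended resource name advertised by the NVIDIA device plugin (deployed by the NVIDIA operators). The sketch below is illustrative, not taken from this document: the pod name, container image, and command are assumptions, while `nvidia.com/gpu` is the standard resource name exposed by the plugin.

```yaml
# Hypothetical pod spec requesting one virtualized GPU on a TKC worker node.
# Names and image tag are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: cuda-test
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda-container
    image: nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1   # extended resource advertised by the NVIDIA device plugin
```

The Kubernetes scheduler places the pod only on a worker node that advertises an available `nvidia.com/gpu` resource, so no node selection logic is needed in the pod spec itself.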
The rest of this section describes how GPUs are virtualized and made available to both VMs and containers.