VMware ESXi offers two options for using graphics processing units (GPUs) to provide acceleration to workloads in virtual machines (VMs) running on an HCI cluster:
- GPU PCIe pass-through device mode
- Virtual GPU partitioning
In GPU pass-through mode, an entire physical GPU is directly assigned to one VM, bypassing any host drivers. The GPU is accessed exclusively by the GPU driver running in the VM to which it is assigned and cannot be shared among VMs. GPU partitioning (also called GPU virtualization) allows you to share a physical GPU device among multiple VMs, with each VM getting a dedicated fraction of the GPU instead of the entire GPU.

The NVIDIA L40 GPUs installed in our test cluster support both pass-through and GPU partitioning configurations. GPU partitioning has the advantage of supporting live migration of VMs between hosts, whereas VMs using GPUs in pass-through mode must be manually placed on hosts where GPU resources are available and restarted during cluster maintenance or in the event of a failure.
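To make the partitioning mode concrete, the sketch below shows how a vGPU profile can be attached to a VM programmatically with pyVmomi, VMware's Python SDK. In the vSphere API, a vGPU is modeled as a PCI pass-through device whose backing is a vmiop (vGPU) profile rather than a whole physical device. This is a minimal sketch under stated assumptions: the vCenter address, credentials, VM name, and the profile string `nvidia_l40-12q` are illustrative placeholders, not values from our test cluster.

```python
# Minimal pyVmomi sketch: attach an NVIDIA vGPU profile to a powered-off VM.
# Hostname, credentials, VM name, and profile name are hypothetical examples.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def attach_vgpu(vm, profile_name):
    """Add a vGPU device to a VM via a reconfigure task (VM must be powered off)."""
    # The vGPU is a PCI pass-through device backed by a vmiop (vGPU) profile,
    # not by a whole physical GPU as in pass-through mode.
    backing = vim.vm.device.VirtualPCIPassthrough.VmiopBackingInfo(vgpu=profile_name)
    device = vim.vm.device.VirtualPCIPassthrough(backing=backing)
    change = vim.vm.device.VirtualDeviceSpec(
        operation=vim.vm.device.VirtualDeviceSpec.Operation.add,
        device=device,
    )
    spec = vim.vm.ConfigSpec(deviceChange=[change])
    return vm.ReconfigVM_Task(spec=spec)

context = ssl._create_unverified_context()  # lab use only; verify certs in production
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="password", sslContext=context)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in view.view if v.name == "gpu-vm-01")
    task = attach_vgpu(vm, "nvidia_l40-12q")  # example L40 profile name
finally:
    Disconnect(si)
```

The same API could assign a full physical GPU in pass-through mode by using a device-ID backing instead of a vmiop profile; the vmiop backing is what distinguishes a partitioned vGPU in the VM's configuration.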
We chose to configure the allocation of GPUs to VMs using GPU partitioning. We used NVIDIA vGPU software, a graphics virtualization platform that provides VMs access to NVIDIA GPU technology. The releases of the NVIDIA vGPU Manager and guest VM drivers that you install must be compatible; installing a guest VM driver release that is incompatible with the installed vGPU Manager release causes the NVIDIA vGPU to fail to load. The following NVIDIA vGPU Manager and guest VM driver release combinations are compatible:
- NVIDIA vGPU Manager with guest VM drivers from the same release
- NVIDIA vGPU Manager with guest VM drivers from different releases within the same major release branch
- NVIDIA vGPU Manager from a later major release branch with guest VM drivers from the previous branch

You must also use a compatible version of the NVIDIA License System with the chosen release of NVIDIA vGPU software.
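To make the three compatibility rules above concrete, here is a short sketch (our own illustration, not an NVIDIA tool) that encodes them, assuming release strings of the form major.minor; the specific release numbers in the examples are hypothetical.

```python
# Illustration of the vGPU Manager / guest driver compatibility rules above.
# Version strings of the form "major.minor" are an assumed convention.

def parse(version: str) -> tuple[int, int]:
    """Split a release string like '16.2' into (major branch, minor release)."""
    major, minor = version.split(".")
    return int(major), int(minor)

def is_compatible(manager: str, guest_driver: str) -> bool:
    """Return True if the guest VM driver release may be used with the vGPU Manager."""
    mgr_major, mgr_minor = parse(manager)
    drv_major, drv_minor = parse(guest_driver)
    if (mgr_major, mgr_minor) == (drv_major, drv_minor):
        return True   # same release
    if mgr_major == drv_major:
        return True   # different releases within the same major release branch
    if mgr_major == drv_major + 1:
        return True   # Manager from a later branch, drivers from the previous branch
    return False

# Examples with hypothetical release numbers:
assert is_compatible("16.2", "16.2")      # same release
assert is_compatible("16.2", "16.0")      # same major branch
assert is_compatible("17.0", "16.3")      # Manager one branch ahead of the drivers
assert not is_compatible("16.2", "17.0")  # guest driver newer than Manager: fails
```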