The Multi-Instance GPU (MIG) feature was designed to provide robust hardware partitioning on the latest NVIDIA A100 and A30 GPUs. Combining NVIDIA MIG-enabled GPUs with NVIDIA vGPU software allows enterprises to apply the management, monitoring, and operational benefits of VMware virtualization to all resources, including AI acceleration.
VMware VMs using MIG-enabled vGPUs provide the flexibility to run a heterogeneous mix of GPU partition sizes on a single host or cluster. MIG-partitioned vGPU instances are fully isolated, each with an exclusive allocation of high-bandwidth memory, cache, and compute. The A100 PCIe card supports MIG configurations with up to seven GPU instances per card, while the A30 GPU supports up to four.
MIG allows multiple vGPU-powered VMs to run in parallel on a single A100 or A30 GPU. One common use case is for administrators to partition available GPUs into multiple units for allocation to individual data scientists. Each data scientist can be assured of predictable performance due to the isolation and Quality of Service guarantees of the vGPU partitioning technology.
The following table lists the options for MIG-supported GPU partitions. Its columns are: GPU profile name, profile name on VMs, fraction of GPU memory, fraction of GPU compute, and maximum number of instances available.
You can create and assign a combination of the preceding profiles to VMs. Only certain combinations of MIG partitions are supported for a single card instance. For more information about MIG and the supported partition details, see the NVIDIA Multi-Instance GPU User Guide.
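To make the combination rules concrete, the following sketch checks whether a requested set of profiles fits within an A100 40GB's slice budget. The slice counts follow the public MIG profile definitions, but this is a simplified model: real MIG placement has additional positional constraints, so always confirm a layout against the NVIDIA Multi-Instance GPU User Guide.

```python
# Simplified feasibility check for MIG profile combinations on an A100 40GB.
# Only slice budgets are checked here; actual MIG placement rules are stricter.

# (compute slices, memory slices) per profile on an A100 40GB
PROFILE_SLICES = {
    "1g.5gb": (1, 1),
    "2g.10gb": (2, 2),
    "3g.20gb": (3, 4),
    "4g.20gb": (4, 4),
    "7g.40gb": (7, 8),
}

A100_COMPUTE_SLICES = 7
A100_MEMORY_SLICES = 8

def fits_on_a100(profiles):
    """Return True if the combined slice demand stays within the GPU budget."""
    compute = sum(PROFILE_SLICES[p][0] for p in profiles)
    memory = sum(PROFILE_SLICES[p][1] for p in profiles)
    return compute <= A100_COMPUTE_SLICES and memory <= A100_MEMORY_SLICES

print(fits_on_a100(["4g.20gb", "2g.10gb", "1g.5gb"]))  # True: 7 compute slices
print(fits_on_a100(["4g.20gb", "4g.20gb"]))            # False: 8 compute slices
```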
When a VM is created, an administrator can assign one of the preceding partitions before powering on the VM. That GPU resource is then exclusively allocated to a single VM, guaranteeing isolation of GPU resources. MIG only supports assigning one partition profile type per VM. The GPU resources are deallocated whenever the VM is powered off or migrated to another server.
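The exclusive-allocation lifecycle described above can be sketched as a small allocator: a partition is bound to exactly one VM at power-on and returned to the pool at power-off. The class and method names here are illustrative only, not a real VMware or NVIDIA API.

```python
# Minimal sketch of exclusive MIG partition allocation, assuming one
# partition per VM, held from power-on until power-off.

class MigGpu:
    def __init__(self, partitions):
        self.free = set(partitions)   # partition instances not yet assigned
        self.assigned = {}            # VM name -> partition it holds

    def power_on(self, vm, partition):
        """Bind a free partition exclusively to one VM."""
        if vm in self.assigned:
            raise RuntimeError(f"{vm} already holds a partition")
        if partition not in self.free:
            raise RuntimeError(f"{partition} is not available")
        self.free.remove(partition)   # no other VM can take it now
        self.assigned[vm] = partition

    def power_off(self, vm):
        """Deallocate the VM's partition and return it to the pool."""
        self.free.add(self.assigned.pop(vm))
```

A migration to another server would behave like `power_off` on the source host followed by `power_on` against the destination host's pool.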
The following figure shows an example of how MIG partitions are allocated to VMs:
The example shows an A100 GPU partitioned into three compatible MIG profiles: MIG 4g.20gb, MIG 2g.10gb, and MIG 1g.5gb. These profiles might be assigned, respectively, to light training using TensorFlow, model development using Jupyter notebooks, and inference using TensorRT.
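On the host, an administrator could create this three-profile layout with `nvidia-smi`. The commands below are a sketch: the numeric profile IDs 5, 14, and 19 correspond to 4g.20gb, 2g.10gb, and 1g.5gb on an A100 40GB, but IDs vary by GPU model, so confirm them with the `-lgip` listing first.

```shell
# Enable MIG mode on GPU 0 (takes effect after a GPU reset).
nvidia-smi -i 0 -mig 1

# List the GPU instance profiles and their numeric IDs for this GPU.
nvidia-smi mig -lgip

# Create the three GPU instances from the figure, plus their default
# compute instances (-C). Verify the IDs against the -lgip output.
nvidia-smi mig -cgi 5,14,19 -C

# Confirm the resulting partitions.
nvidia-smi mig -lgi
```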