This design guide is intended for IT administrators who want to run mixed workloads, including neural network training, inference, and model development, on VMware vSphere with NVIDIA GPUs on PowerEdge servers. The following figure shows how PowerEdge servers, each with two NVIDIA GPUs and a ConnectX network adapter, can be implemented as part of a VMware vSphere cluster to support virtual GPUs for both VMs and containers:
Servers running VMware vSAN provide the storage repository for the VMs and their operating systems, while the PowerScale resources provide data lake storage for analytical workloads.
Using the Multi-Instance GPU (MIG) capability of the NVIDIA A100 GPUs, administrators can create vGPU profiles and assign them to VMs. These VMs can run various workloads, such as TensorFlow for training, Jupyter notebooks for interactive model development, and TensorRT for inference.
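As a rough planning aid, the following Python sketch checks whether a set of MIG-backed profiles could share one GPU. It is an illustration under stated assumptions, not part of the validated design. It assumes the 40 GB variant of the A100, which exposes seven compute slices and eight 5 GB memory slices. The VM names and the `fits_on_one_gpu` helper are hypothetical. The check only compares slice totals. Staying within the totals is necessary but does not guarantee a valid layout, because MIG also restricts where instances can be placed on the GPU.

```python
# Assumption: NVIDIA A100 40 GB. Each MIG profile maps to
# (compute slices, memory slices), where one memory slice is 5 GB.
MIG_PROFILES = {
    "1g.5gb": (1, 1),
    "2g.10gb": (2, 2),
    "3g.20gb": (3, 4),
    "4g.20gb": (4, 4),
    "7g.40gb": (7, 8),
}
COMPUTE_SLICES = 7  # compute slices available on one A100
MEMORY_SLICES = 8   # memory slices available on one A100 40 GB


def fits_on_one_gpu(requested):
    """Return True if the requested profiles stay within the slice totals.

    This is a necessary check only; it does not model MIG placement rules.
    """
    compute = sum(MIG_PROFILES[name][0] for name in requested)
    memory = sum(MIG_PROFILES[name][1] for name in requested)
    return compute <= COMPUTE_SLICES and memory <= MEMORY_SLICES


if __name__ == "__main__":
    # Hypothetical VM names, one per workload type named in this guide.
    plan = {
        "training-vm": "3g.20gb",   # TensorFlow training
        "notebook-vm": "2g.10gb",   # Jupyter model development
        "inference-vm": "1g.5gb",   # TensorRT inference
    }
    print(fits_on_one_gpu(plan.values()))
```

In this sketch, the three example VMs use six compute slices and seven memory slices, so they stay within the totals for a single GPU. A plan that exceeds either total would need to be spread across the two GPUs in a server or across additional servers in the cluster.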