GPU Acceleration for Dell Azure Stack HCI: Consistent and Performant AI/ML Workloads

The end of 2022 brought us excellent news: Dell Integrated System for Azure Stack HCI introduced full support for factory-installed GPUs.

As a reminder, Dell Integrated System for Microsoft Azure Stack HCI is a fully integrated HCI system for hybrid cloud environments that delivers a modern, cloud-like operational experience on-premises. It is intelligently and deliberately configured with a wide range of hardware and software component options (AX nodes) to meet the requirements of nearly any use case, from the smallest remote or branch office to the most demanding business workloads.

With the introduction of GPU-capable AX nodes, we can now also support more complex and demanding AI/ML workloads.

New GPU hardware options

Not all AX nodes support GPUs. As the table below shows, AX-750, AX-650, and AX-7525 nodes running Azure Stack HCI OS 21H2 or later are the only AX node platforms that support GPU adapters.

Table 1: Intelligently designed AX node portfolio

Note: AX-640, AX-740xd, and AX-6515 platforms do not support GPUs.

The next obvious question is which GPU types each platform supports, and how many adapters.

We have selected the following two NVIDIA adapters to start with:

NVIDIA Ampere A2, PCIe, 60W, 16GB GDDR6, Passive, Single Wide
NVIDIA Ampere A30, PCIe, 165W, 24GB HBM2, Passive, Double Wide

The following table details how many GPU adapter cards of each type are allowed in each AX node:

Table 2: AX node support for GPU adapter cards

GPU adapter	AX-750	AX-650	AX-7525
NVIDIA A2	Up to 2	Up to 2	Up to 3
NVIDIA A30	Up to 2	--	Up to 3
Maximum number of GPUs (all must be the same model)	2	2	3
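
The rules in Table 2 can be captured in a few lines of code, which is handy when checking a planned configuration before ordering. The following is a minimal sketch, not a Dell tool: the names MAX_GPUS and validate_gpu_config are hypothetical, and the sketch assumes that the AX-650 does not accept the A30 and that any platform missing from the table has no GPU support.

```python
# Hypothetical encoding of Table 2 (illustrative names, not a Dell API).
# Each entry maps an AX node model to the maximum adapter count per GPU model.
MAX_GPUS = {
    "AX-750": {"A2": 2, "A30": 2},
    "AX-650": {"A2": 2},  # assumption: A30 is not supported on the AX-650
    "AX-7525": {"A2": 3, "A30": 3},
}


def validate_gpu_config(node: str, gpus: list[str]) -> tuple[bool, str]:
    """Check a planned list of GPU models for one node against Table 2."""
    if node not in MAX_GPUS:
        return False, f"{node} does not support GPUs"
    if not gpus:
        return True, "no GPUs requested"
    if len(set(gpus)) > 1:
        return False, "all GPUs in a node must be the same model"
    model = gpus[0]
    limit = MAX_GPUS[node].get(model, 0)
    if limit == 0:
        return False, f"{model} is not supported in {node}"
    if len(gpus) > limit:
        return False, f"{node} supports up to {limit} x {model}"
    return True, "supported"


if __name__ == "__main__":
    for node, gpus in [("AX-750", ["A30", "A30"]), ("AX-650", ["A30"])]:
        print(node, gpus, validate_gpu_config(node, gpus))
```

Keeping the limits in a single dictionary means that a newly validated adapter only needs a new entry, with no change to the checking logic.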

Use cases

The NVIDIA A2 is the entry-level option for adding basic AI capabilities to any server. It delivers versatile inferencing acceleration for deep learning, graphics, and video processing in a low-profile, low-power PCIe Gen 4 card.

The A2 is a strong candidate for data center workloads that need only light AI capabilities. It especially shines in edge environments, thanks to its balance of form factor, performance, and power consumption, which results in lower costs.

The NVIDIA A30 is a more powerful mainstream option for the data center. It typically covers scenarios that require higher accelerated AI performance across a broad variety of workloads:

AI inference at scale
Deep learning training
High-performance computing (HPC) applications
High-performance data analytics

Options for GPU virtualization

There are two GPU virtualization technologies in Azure Stack HCI: Discrete Device Assignment (also known as GPU pass-through) and GPU partitioning.

Discrete Device Assignment (DDA)

DDA support for Dell Integrated System for Azure Stack HCI was introduced with Azure Stack HCI OS 21H2. With DDA, each GPU is dedicated to a single VM (no sharing): DDA passes an entire PCIe device into the VM, which gets high-performance access to the device and can use the device's native drivers. The following figure shows how DDA directly reassigns the whole GPU from the host to the VM:

Figure 1: Discrete Device Assignment in action

To learn more about how to use and configure GPUs with clustered VMs on Azure Stack HCI OS 21H2, see Microsoft Learn and the Dell Info Hub.

GPU partitioning (GPU-P)

GPU partitioning allows you to share a physical GPU device among several VMs. By leveraging single root I/O virtualization (SR-IOV), GPU-P provides each VM with a dedicated and isolated fraction of the physical GPU. The following figure illustrates this:

Figure 2: GPU partitioning virtualizing 2 physical GPUs into 4 vGPUs

The obvious advantage of GPU-P is that it enables enterprise-wide utilization of highly valuable and limited GPU resources.
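
To make the difference between the two models concrete, here is a small conceptual sketch. It only counts what can be handed to VMs and does not talk to any hypervisor. The Slice type and both function names are hypothetical, and the sketch assumes that every partition of a GPU is the same size, which is a simplification for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Slice:
    """One assignable unit: a whole GPU (DDA) or a partition of it (GPU-P)."""
    gpu_id: int
    share: Fraction  # fraction of the physical GPU given to a single VM


def dda_slices(gpu_count: int) -> list[Slice]:
    """DDA: every physical GPU is passed whole to exactly one VM."""
    return [Slice(gpu, Fraction(1)) for gpu in range(gpu_count)]


def gpu_p_slices(gpu_count: int, partitions_per_gpu: int) -> list[Slice]:
    """GPU-P: every physical GPU is split into equal partitions (assumption)."""
    if partitions_per_gpu < 1:
        raise ValueError("partitions_per_gpu must be at least 1")
    share = Fraction(1, partitions_per_gpu)
    return [
        Slice(gpu, share)
        for gpu in range(gpu_count)
        for _ in range(partitions_per_gpu)
    ]


if __name__ == "__main__":
    # Two physical GPUs: 2 VMs can be served with DDA, 4 with two partitions each.
    print(len(dda_slices(2)), len(gpu_p_slices(2, 2)))
```

With two GPUs and two partitions per GPU, the sketch yields four assignable units of half a GPU each, which matches the scenario in Figure 2.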

Note these important considerations for using GPU-P:

Azure Stack HCI OS 22H2 or later is required.
GPU drivers are required on both the host and the guest VMs (these require a separate license from NVIDIA).
Not all GPUs support GPU-P; currently, Dell supports only the A2 (A16 support is coming soon).
We strongly recommend using Windows Admin Center for GPU-P to avoid mistakes.
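
The considerations above lend themselves to a simple pre-flight check. The sketch below is illustrative only: the function names are hypothetical, the inputs are supplied by hand instead of being queried from a cluster, and the set of supported models reflects only what this post states at the time of writing.

```python
# Assumption: only the A2 is supported for GPU-P at the time of writing.
GPU_P_MODELS = {"A2"}
MINIMUM_OS = "22H2"


def release_key(version: str) -> tuple[int, int]:
    """Turn a release label such as '22H2' into a sortable (year, half) pair."""
    year, half = version.upper().split("H")
    return int(year), int(half)


def gpu_p_issues(
    os_version: str,
    gpu_model: str,
    host_driver_installed: bool,
    guest_driver_installed: bool,
) -> list[str]:
    """Return the unmet GPU-P prerequisites; an empty list means none were found."""
    issues = []
    if release_key(os_version) < release_key(MINIMUM_OS):
        issues.append(f"Azure Stack HCI OS {MINIMUM_OS} or later is required")
    if gpu_model not in GPU_P_MODELS:
        issues.append(f"{gpu_model} is not supported for GPU-P")
    if not (host_driver_installed and guest_driver_installed):
        issues.append("GPU drivers are needed on both the host and the guest VMs")
    return issues


if __name__ == "__main__":
    for issue in gpu_p_issues("21H2", "A30", True, False):
        print("-", issue)
```

A check like this does not replace Windows Admin Center; it only makes the prerequisites explicit before you start.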

You’re probably wondering about Azure Virtual Desktop on Azure Stack HCI (still in preview) and GPU-P. We have a Dell Validated Design today and will be refreshing it to include GPU-P during this calendar year.

To learn more about how to use and configure GPU-P with clustered VMs on Azure Stack HCI OS 22H2, see Microsoft Learn and the Dell Info Hub (Dell documentation coming soon).

Timeline

As of today, Dell Integrated System for Microsoft Azure Stack HCI supports only Azure Stack HCI OS 21H2 and DDA.

Full support for Azure Stack HCI OS 22H2 and GPU-P is expected by the end of the first quarter of 2023.

Conclusion

The wait is finally over: our Azure Stack HCI environments can now leverage the GPU power that highly demanding AI/ML workloads require.

Today, DDA provides fully dedicated GPU pass-through utilization, whereas with GPU-P we will very soon have the choice of providing a more granular GPU consumption model.

Thanks for reading, and stay tuned for the ever-expanding list of validated GPUs that will unlock and enhance even more use cases and workloads!

Author: Ignacio Borrero, Senior Principal Engineer, Technical Marketing Dell CI & HCI

@virtualpeli
