The Dell Technologies HPC & AI Innovation Lab engineering team deployed 25 GbE PowerSwitch network switches for this validated design. This design is suited for neural network training jobs that can run on a single node (using at most two GPUs), and for model development and inference jobs that take advantage of GPU partitioning.
This design option uses S5248-ON switches as the top-of-rack (ToR) switches for management, vSphere vMotion, and VM traffic. You can also use your existing 10 Gb or 25 Gb Ethernet network infrastructure instead of the S5248-ON switches.
The following figure shows the network topology for a 25 GbE design with PowerSwitch networking:
The preceding figure shows network connectivity for one PowerEdge server only. The other PowerEdge servers in the vSphere cluster have similar connectivity. Two redundant S5248-ON switches are used as the ToR switches providing 25 Gb Ethernet connectivity. A ConnectX 25 GbE dual-port network adapter in the PowerEdge server provides connectivity to the ToR switches.
An N3248TE-ON switch provides 1 Gb Ethernet for out-of-band (OOB) management connectivity. The iDRAC of each PowerEdge server is connected to this switch.
vCenter Server and PowerScale storage also connect to the ToR switches.
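The connectivity described above can be expressed as a simple cabling check. The following is a minimal sketch, not part of the validated design: the server names, port labels, and switch labels (`tor-a`, `tor-b`, `oob`) are hypothetical placeholders. It verifies that each server has a data link to both ToR switches and that its iDRAC port lands on the OOB switch.

```python
# Hypothetical labels for the two redundant S5248-ON ToR switches
# and the N3248TE-ON OOB switch. None of these names come from the design.
TOR_SWITCHES = {"tor-a", "tor-b"}
OOB_SWITCH = "oob"

# Hypothetical link inventory: (server, port, switch).
links = [
    ("server-01", "nic-port1", "tor-a"),
    ("server-01", "nic-port2", "tor-b"),
    ("server-01", "idrac", "oob"),
    ("server-02", "nic-port1", "tor-a"),
    ("server-02", "nic-port2", "tor-a"),  # miscabled: both ports on one switch
    ("server-02", "idrac", "oob"),
]


def check_cabling(links):
    """Return a dict mapping each miscabled server to a list of problems."""
    tor_seen = {}  # server -> set of ToR switches reached by data ports
    oob_ok = {}    # server -> True once the iDRAC port is seen on the OOB switch
    for server, port, switch in links:
        tor_seen.setdefault(server, set())
        oob_ok.setdefault(server, False)
        if port == "idrac":
            if switch == OOB_SWITCH:
                oob_ok[server] = True
        elif switch in TOR_SWITCHES:
            tor_seen[server].add(switch)

    problems = {}
    for server in sorted(tor_seen):
        issues = []
        missing = TOR_SWITCHES - tor_seen[server]
        if missing:
            issues.append("no link to " + ", ".join(sorted(missing)))
        if not oob_ok[server]:
            issues.append("iDRAC not connected to OOB switch")
        if issues:
            problems[server] = issues
    return problems


report = check_cabling(links)
print(report)
```

In this sample inventory, only `server-02` is reported, because both of its adapter ports are cabled to the same ToR switch and a single switch failure would isolate it.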
The following table describes the networks that are configured as part of the validated design:
Used by ESXi for host management.
Used by ESXi for vMotion.
Used by ESXi for vSAN traffic.
Supervisor Cluster Management: Management network for Supervisor Cluster control plane VMs.
NSX Advanced Load Balancer (Avi) Management: The Avi Controller, also called the Controller, resides on this network. It provides the Controller with connectivity to the vCenter Server, ESXi hosts, and the Supervisor Cluster control plane nodes.
NSX Advanced Load Balancer (Avi) Data Network: The data interfaces of the Avi Service Engines, also called Service Engines, connect to this network. The load balancer virtual IPs (VIPs) are assigned from this network.
Primary Workload Management: Additional management network for Supervisor Cluster control plane VMs.
Workload Domain Network: Carries traffic for the Tanzu Kubernetes cluster control plane VMs and workloads.
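When planning addresses for these networks, it can help to confirm that the subnets do not overlap and that load balancer VIPs fall inside the Avi data network. The following is a minimal sketch using Python's standard `ipaddress` module. All subnets and the sample VIPs are hypothetical placeholders, not values from the validated design, and the helper names `find_overlaps` and `vip_in_data_network` are illustrative.

```python
import ipaddress
from itertools import combinations

# Hypothetical addressing plan. Replace the CIDR blocks with your own values.
plan = {
    "Supervisor Cluster Management": "192.168.10.0/24",
    "NSX Advanced Load Balancer (Avi) Management": "192.168.20.0/24",
    "NSX Advanced Load Balancer (Avi) Data Network": "192.168.30.0/24",
    "Primary Workload Management": "192.168.40.0/24",
    "Workload Domain Network": "192.168.50.0/24",
}

DATA_NETWORK = "NSX Advanced Load Balancer (Avi) Data Network"


def find_overlaps(plan):
    """Return pairs of network names whose subnets overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}
    return [
        (a, b)
        for a, b in combinations(nets, 2)
        if nets[a].overlaps(nets[b])
    ]


def vip_in_data_network(vip, plan):
    """Check that a VIP address belongs to the Avi data network subnet."""
    data_net = ipaddress.ip_network(plan[DATA_NETWORK])
    return ipaddress.ip_address(vip) in data_net


print(find_overlaps(plan))
print(vip_in_data_network("192.168.30.10", plan))
print(vip_in_data_network("192.168.40.10", plan))
```

A check like this is easiest to run before the networks are configured, because changing a subnet after deployment typically means reconfiguring the components attached to it.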