Test environment and methodology

Test tools

To ensure the optimal combination of end-user experience and cost-per-user, we carry out performance analysis and characterization of Dell Technologies VDI solutions using a carefully designed, holistic methodology that monitors both hardware resource utilization and end-user experience during load testing.

Login VSI is an industry-standard tool for testing VDI environments and server-based computing. The tool installs a standard collection of desktop application software (for example, Microsoft Office and Adobe Acrobat Reader) on each VDI desktop. It then uses launcher systems to connect a specified number of users to available desktops within the environment. After a user is connected, a login script starts the workload by launching the test script once the user environment is configured. Each launcher system can launch connections to several target machines (in this case, VDI desktops).

For Login VSI, the launchers and the virtual session indexer environment were configured and managed by a centralized management console. Additionally, the following login and boot paradigm was used:

Users were logged in within a login timeframe of 1 hour. An exception to this login timeframe occurred when testing low-density solutions such as GPU/graphics-based configurations. With those configurations, users were logged in every 5 seconds.
All desktops were pre-booted before logins commenced.
The data collection interval for vSAN metrics was 5 minutes.
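
As an illustration of the login pacing described above, the following Python sketch computes the spacing between session launches for a fixed login window and the number of sessions launched after a given elapsed time. It assumes that sessions are launched at evenly spaced intervals starting at time zero. The function names and the user counts in the usage note are hypothetical and are not taken from the test runs.

```python
def login_interval_seconds(total_users: int, window_seconds: float = 3600.0) -> float:
    """Return the even spacing between session launches for a login window.

    The default window of 3600 seconds matches the 1-hour login timeframe.
    """
    if total_users <= 0:
        raise ValueError("total_users must be positive")
    return window_seconds / total_users


def users_logged_in(elapsed_seconds: float, interval_seconds: float, total_users: int) -> int:
    """Return how many sessions have been launched after elapsed_seconds.

    Assumes the first session launches at t = 0 and one more session
    launches every interval_seconds until total_users is reached.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if elapsed_seconds < 0:
        return 0
    launched = int(elapsed_seconds // interval_seconds) + 1
    return min(launched, total_users)
```

For a hypothetical run of 600 users, `login_interval_seconds(600)` gives one login every 6 seconds across the 1-hour window. For a hypothetical graphics run of 48 users with the fixed 5-second interval, the last session launches 235 seconds after the first.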

For the AI testing, AI training and AI model validation took place on the same VM. After training, we validated the resulting models with the same VM hardware configuration that was used for training.

The methodology that we used for training and model validation was similar to the method used in the AI-assisted Radiology Using Distributed Deep Learning on Apache Spark and Analytics Zoo White Paper. Model weights were collected for each of the 15 training epochs, and each set of weights was subsequently validated with the same techniques as in the referenced paper. For example, the average AUC-ROC¹ score was calculated across all disease categories contained in the dataset. The model with the highest average AUC-ROC score can be used for inferencing on real-world data. We validated against the entire dataset used for training rather than against a withheld (holdout) portion of the dataset; holdout validation is the more common technique.
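
The following Python sketch illustrates the selection step described above: compute the AUC-ROC for each disease category, average the scores, and pick the epoch with the highest average. It is a minimal sketch and not the code used in the referenced paper. The function names and data layout are assumptions, and the pairwise AUC calculation is chosen for readability only.

```python
from typing import Dict, Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC-ROC for one binary category.

    Computed as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score, counting ties as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def average_auc_roc(
    labels_by_category: Dict[str, Sequence[int]],
    scores_by_category: Dict[str, Sequence[float]],
) -> float:
    """Return the unweighted mean AUC-ROC over all categories."""
    aucs = [
        auc_roc(labels_by_category[name], scores_by_category[name])
        for name in labels_by_category
    ]
    return sum(aucs) / len(aucs)


def best_epoch(avg_auc_by_epoch: Dict[int, float]) -> int:
    """Return the epoch whose weights gave the highest average AUC-ROC."""
    return max(avg_auc_by_epoch, key=avg_auc_by_epoch.get)
```

The pairwise loop scales with the product of positive and negative sample counts, so for a dataset of this size a library routine such as scikit-learn's `roc_auc_score` would be the more practical choice.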

Resource monitoring

The team used several methods for resource and component monitoring during performance testing and in the deployed solutions where applicable.

We used VMware vCenter for VMware vSphere-based solutions to gather key data (CPU, GPU, memory, disk, and network usage) from each of the compute hosts during each test run. We exported this data to .csv files for single hosts and then consolidated it to show data from all hosts (when multiple hosts were tested). While the report does not include specific performance metrics for the Management host servers, we monitored these servers during testing to ensure that they were performing at an expected performance level with no bottlenecks.
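
The consolidation step can be sketched as follows. This Python example averages one metric across several per-host .csv exports for each sample time. The column names `timestamp` and `cpu_usage_pct` and the sample values are hypothetical; the actual layout of a vCenter export may differ.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, List, TextIO


def consolidate(host_files: Iterable[TextIO], metric: str) -> Dict[str, float]:
    """Average one metric across hosts for each timestamp.

    Each file is expected to have a header row containing a 'timestamp'
    column (an assumed name) and a column named by `metric`.
    """
    samples: Dict[str, List[float]] = defaultdict(list)
    for handle in host_files:
        for row in csv.DictReader(handle):
            samples[row["timestamp"]].append(float(row[metric]))
    return {ts: sum(vals) / len(vals) for ts, vals in sorted(samples.items())}


# Two in-memory stand-ins for per-host exports (illustrative values only).
host_a = io.StringIO("timestamp,cpu_usage_pct\n10:00,60.0\n10:05,80.0\n")
host_b = io.StringIO("timestamp,cpu_usage_pct\n10:00,70.0\n10:05,90.0\n")
combined = consolidate([host_a, host_b], "cpu_usage_pct")
```

With real exports, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`.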

We gathered GPU performance metrics directly from the vSphere client, either manually or using a script.

For resource utilization, we determined the user density at a reasonable system load. Testing to system failure was out of scope. To achieve a reasonable system load, we set target thresholds for system resources, as described in the following table. These thresholds reflect a system that is well utilized but not near failure.

Table 3. Resource utilization thresholds
Metrics	Target threshold
Average CPU usage	85%
Average CPU core utilization	85%
Average CPU readiness	10%
Average memory utilization (active)	85%
Consumed memory	<100%
Memory ballooning	None
Memory swapping	None
Network throughput	85%
Storage latency	20 milliseconds (ms)
Spare storage capacity	15%
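
As a sketch of how the thresholds in Table 3 might be applied to measured values, the following Python example flags any metric that falls outside its target. It assumes that each percentage and the latency value is an upper limit, that "None" means a measured value of zero, and that spare storage capacity is a lower limit. The metric key names are hypothetical.

```python
from typing import Dict, List, Tuple

# (kind, limit) per metric, transcribed from Table 3.
# "max": value must be <= limit; "below": value must be < limit;
# "min": value must be >= limit. These readings are assumptions.
THRESHOLDS: Dict[str, Tuple[str, float]] = {
    "avg_cpu_usage_pct": ("max", 85.0),
    "avg_cpu_core_utilization_pct": ("max", 85.0),
    "avg_cpu_readiness_pct": ("max", 10.0),
    "avg_active_memory_pct": ("max", 85.0),
    "consumed_memory_pct": ("below", 100.0),
    "memory_ballooning": ("max", 0.0),
    "memory_swapping": ("max", 0.0),
    "network_throughput_pct": ("max", 85.0),
    "storage_latency_ms": ("max", 20.0),
    "spare_storage_capacity_pct": ("min", 15.0),
}


def violations(measured: Dict[str, float]) -> List[str]:
    """Return the names of measured metrics that miss their target."""
    failed = []
    for name, (kind, limit) in THRESHOLDS.items():
        if name not in measured:
            continue  # metric was not collected for this run
        value = measured[name]
        if kind == "max":
            ok = value <= limit
        elif kind == "below":
            ok = value < limit
        else:  # "min"
            ok = value >= limit
        if not ok:
            failed.append(name)
    return failed
```

A run whose `violations` list is empty stays within the targets; under this approach, the user density would be taken at the highest load for which that still holds.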

Test configuration

The following table describes the host hardware configurations of the infrastructure that we used for the performance analysis and characterization tests:

Table 4. Validated hardware configurations
Category	Platform	CPU	Memory	Network
Compute host hardware	4 x VxRail V570F (Density Optimized)	2 x Intel Xeon Gold 6248 at 2.5 GHz, 20-core processors	768 GB memory at 2933 MT/s (12 x 64 GB DDR4)	Broadcom Adv. Dual 25 Gb Ethernet
Management host hardware	4 x VxRail E460F (Management)	2 x Intel Xeon CPU E5-2698 v4 at 2.20 GHz	512 GB memory at 2400 MT/s (16 x 32 GB DDR4)	Intel 2P X520/2P I350 rNDC

The following table describes the additional hardware and software components that we used:

Table 5. Hardware and software components
Category	Component
NVIDIA GPU	3 NVIDIA RTX 8000 GPU cards installed in one host
Storage	Compute vSAN: VxRail HBA330-Adp; BOSS S1 (2 x 240 GB SATA M.2 SSD); 2 x 800 GB WI SSDs + 2 x 1.92 TB RI SSDs per host; 2 vSAN disk groups. Management: HBA330 Mini; 2 x 400 GB WI SSDs, 2 x 1.92 TB RI SSDs
Network	PowerSwitch S5248-ON switch
Patches	All Spectre/Meltdown/L1TF patches were applied to the parent image and ESXi hosts as required.
Protocol	BLAST Extreme H.264 + Switch Codec
Broker	VMware Horizon 8.0.0, build 16592062
Hypervisor	VMware vSphere ESXi 7.0.0, build 15843807; VMware vCenter 7.0.0, build 15952498
SQL	Microsoft SQL Server 2019
vGPU software version	NVIDIA vGPU 11
Desktop operating system	Microsoft Windows 10 Enterprise 64-Bit, 1909 version
Microsoft Office version	Office 2019
Management operating system	Microsoft Windows Server 2019
Login VSI version	Login VSI 4.1.40.1
Antivirus software	Windows Defender

AI training and validation configuration, dataset, and model

The following tables show the AI VM and operating system configurations, the dataset, and the model topology that we chose for our AI training and validation. Other AI VM configurations are possible and might improve performance, but we did not explore them extensively.

Table 6. AI VM configurations
Configuration	CPU	Memory	Disk	GPU PCI 0*	GPU PCI 1*	GPU PCI 2*
AI VM	20 vCPUs, 1 core per socket, 20 sockets	256 GB reserved	560 GB HD	Rtx8000 Grid_rtx8000p-48q	Rtx8000 Grid_rtx8000p-48q	Rtx8000 Grid_rtx8000p-48q

*In the testing scenarios that follow, we performed testing with one, two, or all of the GPUs assigned to the AI VM.

Table 7. Operating system configurations
Operating system	Version	Kernel	NVIDIA driver	NVIDIA-SMI CUDA	CUDA
Ubuntu	18.04.4 LTS	5.4.0-47-generic	440.87	10.2	11.0.182

Table 8. Dataset used for training and validation
Data set	Images	Description
ChestXRay14	112,120 PNG images	Chest x-ray dataset containing over 100,000 frontal-view x-ray images labeled with 14 disease categories.

Table 9. Deep-learning convolutional neural network (DLCNN) model topology
Neural network topology	Description
DenseNet	DenseNet alleviates the vanishing-gradient problem, strengthens feature propagation, encourages feature reuse, and substantially reduces the number of parameters.

Graphics VM configuration

The following table shows the configuration of the graphics VM:

Table 10. Graphics VM configuration
Configuration	CPU	Memory	Disk	GPU PCI 0
Graphics VM	4 vCPUs	8 GB reserved	60 GB HD	Rtx8000 Grid_rtx8000p-2q
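
The vGPU profile name indicates how much GPU frame buffer each VM receives: the number before the trailing "q" is the frame buffer size in GB. The following Python sketch derives a frame-buffer-based upper bound on the number of graphics VMs per GPU and per host. It assumes 48 GB of frame buffer per RTX 8000 card and that every vGPU on a physical GPU uses the same profile. The function names are hypothetical, and the result is a ceiling only: CPU, memory, and the other thresholds described earlier may limit density further.

```python
import re

RTX8000_FRAME_BUFFER_GB = 48  # assumed frame buffer per physical RTX 8000


def framebuffer_gb(profile: str) -> int:
    """Extract the per-VM frame buffer size in GB from a profile name.

    Expects names that end in '-<N>q', such as 'grid_rtx8000p-2q'.
    """
    match = re.search(r"-(\d+)q$", profile.lower())
    if match is None:
        raise ValueError(f"unrecognized vGPU profile name: {profile!r}")
    return int(match.group(1))


def max_vms_per_gpu(profile: str, gpu_gb: int = RTX8000_FRAME_BUFFER_GB) -> int:
    """Return how many VMs with this profile fit in one GPU's frame buffer."""
    return gpu_gb // framebuffer_gb(profile)


def max_vms_per_host(profile: str, gpus: int, gpu_gb: int = RTX8000_FRAME_BUFFER_GB) -> int:
    """Return the frame-buffer-based ceiling for a host with `gpus` cards."""
    return gpus * max_vms_per_gpu(profile, gpu_gb)
```

Under these assumptions, the 2q profile in Table 10 allows up to 24 graphics VMs per GPU, or 72 on a host with three cards, while the 48q profile used for the AI VM dedicates a whole GPU to a single VM.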

Medical dataset used for graphics

The following table shows the medical dataset that we used for the VDI graphics tests:

Table 11. Medical dataset used
Dataset	Images	Description
DICOM Library for abdomen	361 DICOM images	Anonymized CT image dataset of the abdomen, containing 361 DICOM images

¹ Area under the curve (AUC) of the receiver operating characteristic (ROC)
