To ensure the optimal combination of end-user experience (EUE) and cost-per-user, performance analysis and characterization (PAAC) on Dell VDI solutions is carried out using a carefully designed, holistic methodology that monitors both hardware resource utilization parameters and EUE during load-testing.
Two APEX instance configurations were tested in this validation effort:
- The APEX Private Cloud “Memory optimized” configuration (8 GB of memory per CPU core) was used with VMware Horizon instant clone VDI desktops for one test scenario, outlined below, to validate the solution's performance. This configuration was tested using the Login VSI load-testing tool.
- The APEX Private Cloud “VDI optimized” configuration (32 GB of memory per CPU core) with GPU-accelerated graphics was used for the second test scenario, running vGPU-enabled VDI desktops with a 1 GB frame buffer. This configuration was tested using NVIDIA's nVector testing tool.
A three-host VxRail cluster with 64 CPU cores per host provided a total of 192 instances for the Login VSI test scenario. A single host was used for the graphics testing, providing 64 instances.
Login VSI performance testing process and monitoring
Each test scenario was repeated 4 times:
- A pilot run to validate that the infrastructure was performing correctly and that valid data could be captured.
- Three subsequent runs to enable data correlation.
During testing, while the environment was under load, we logged in to a session and completed tasks that correspond to the user workload. This test is subjective, but it provides a better understanding of the EUE in the desktop sessions, particularly under high load. It also helps to ensure reliable data gathering.
To ensure that the user experience was not compromised, the Dell VDI team monitored the following important resources:
- Compute host servers—For solutions based on VMware vSphere, VMware vCenter gathers key data (CPU, memory, disk, and network usage) from each of the compute hosts during each test run. This data is exported to .csv files for single hosts and then consolidated to show data from all hosts; a consolidation sketch follows the table below. While the report does not include specific performance metrics for the management host servers, these servers are monitored during testing to ensure that they are performing at an expected level with no bottlenecks.
- Hardware resources—Resource overutilization can cause poor EUE. We monitored the relevant resource utilization parameters and compared them to relatively conservative thresholds, selected based on industry best practices and our experience to provide an optimal trade-off between good EUE and cost-per-user while allowing sufficient burst capacity for seasonal or intermittent spikes in demand. The following table shows the thresholds that the Dell VDI team set for this testing; a scripted version of this check is sketched after the table:
Table 1. Resource utilization parameters

Parameter | Pass/fail threshold |
Physical host CPU utilization | 85% |
Physical host memory utilization | 85% |
Network throughput | 85% |
Disk latency | 20 milliseconds |
Login VSI failed sessions | 2% |
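Conceptually, the per-host .csv exports can be consolidated and compared against the thresholds in Table 1 with a short script. The following Python sketch illustrates the idea; the file layout and column names (`cpu_pct`, `memory_pct`, and so on) are assumptions for illustration, not the Dell VDI team's actual tooling.

```python
import glob
import pandas as pd

# Pass/fail thresholds from Table 1 (Login VSI failed sessions are
# reported by Login VSI itself and checked separately).
THRESHOLDS = {
    "cpu_pct": 85.0,         # physical host CPU utilization
    "memory_pct": 85.0,      # physical host memory utilization
    "network_pct": 85.0,     # network throughput
    "disk_latency_ms": 20.0, # disk latency
}

def consolidate(csv_glob: str) -> pd.DataFrame:
    """Merge the per-host vCenter exports into one cluster-wide frame."""
    frames = [pd.read_csv(path) for path in glob.glob(csv_glob)]
    return pd.concat(frames, ignore_index=True)

def evaluate(cluster: pd.DataFrame) -> dict:
    """Compare steady-state averages against the pass/fail thresholds."""
    results = {}
    for metric, limit in THRESHOLDS.items():
        observed = cluster[metric].mean()
        results[metric] = {"observed": round(observed, 2),
                           "limit": limit,
                           "passed": observed <= limit}
    return results

if __name__ == "__main__":
    cluster = consolidate("exports/host-*.csv")  # hypothetical export location
    for metric, outcome in evaluate(cluster).items():
        status = "PASS" if outcome["passed"] else "FAIL"
        print(f"{metric}: {outcome['observed']} (limit {outcome['limit']}) -> {status}")
```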
Load generation
Login VSI installs a standard collection of desktop application software, including Microsoft Office and Adobe Acrobat Reader, on each VDI desktop testing instance. It then uses a configurable launcher system to connect a specified number of simulated users to available desktops within the environment. When the simulated user is connected, a login script configures the user environment and starts a defined workload. Each launcher system can launch connections to several VDI desktops (target machines). A centralized management console configures and manages the launchers and the Login VSI environment.
We used the following login and boot conditions:
- Users were logged in within a login timeframe of 1 hour (a quick pacing calculation follows this list).
- All desktops were started before users were logged in.
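As a back-of-the-envelope check, pacing the cluster-wide session count from the results later in this section (170 users per host across three hosts) over the one-hour window works out to roughly one login every seven seconds:

```python
# Back-of-the-envelope login pacing, assuming the cluster-wide session
# count from the Login VSI results below (170 users/host x 3 hosts).
sessions = 170 * 3            # total simulated users
window_s = 60 * 60            # 1-hour login timeframe, in seconds

interval_s = window_s / sessions
print(f"{sessions} logins over {window_s} s -> one every {interval_s:.1f} s")
# 510 logins over 3600 s -> one every 7.1 s
```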
For NVIDIA nVector, the endpoints and desktops are deployed and monitored from an nVector management VM, which runs the test framework, collects data during the test, and analyzes it afterward. Additionally, the following login and boot paradigm is used (a simple polling sketch follows this list):
- The data collection interval is 1 minute for non-vSAN datastores and 5 minutes for vSAN metrics.
- User logon and workload execution are two separate phases, with sessions staggered to start every 5 seconds.
- All desktops are pre-booted before logins commence.
- Data collection is a combination of the automated nVector management framework and manual scripts.
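The manual-scripts portion of that collection can be as simple as a poller that samples each source on its own cadence. Below is a minimal Python sketch assuming hypothetical sampler functions; it is an illustration of the two collection intervals described above, not NVIDIA's actual framework.

```python
import time

# Collection cadences from this test: 1 minute for non-vSAN datastore
# metrics, 5 minutes for vSAN metrics.
INTERVALS_S = {"datastore": 60, "vsan": 300}

def sample_datastore():
    """Hypothetical placeholder: pull datastore counters (e.g., via the vSphere API)."""
    ...

def sample_vsan():
    """Hypothetical placeholder: pull vSAN performance counters."""
    ...

SAMPLERS = {"datastore": sample_datastore, "vsan": sample_vsan}

def collect(duration_s: int) -> None:
    """Poll each source on its own cadence for the length of the test run."""
    next_due = {name: 0.0 for name in INTERVALS_S}
    start = time.monotonic()
    while (elapsed := time.monotonic() - start) < duration_s:
        for name, interval in INTERVALS_S.items():
            if elapsed >= next_due[name]:
                SAMPLERS[name]()
                next_due[name] += interval
        time.sleep(1)

collect(60 * 60)  # example: collect for a one-hour run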
Login VSI workloads
The following table describes the Login VSI workloads that the Dell VDI team tested:
Login VSI workload name | Workload description |
Knowledge Worker | Designed for virtual machines with 2 vCPUs. This workload includes typical knowledge-worker activities: working in Microsoft Outlook, Word, Excel, and PowerPoint, browsing websites, and viewing PDF documents and photos. |
nVector Knowledge Worker | NVIDIA's knowledge-worker workload, which runs comparable office productivity activities on the vGPU-enabled desktops while the nVector framework measures frame rate, image quality, and end-user latency. |
Desktop VM test configurations
The following table summarizes the desktop VM configurations used for the Login VSI workload that the Dell VDI team tested. While this desktop configuration is appropriate for the Login VSI workload, evolving application and operating system workloads are creating increased resource requirements, with configurations of up to 4 vCPUs and 8 GB RAM becoming increasingly common for knowledge workers.
Workload | vCPUs | RAM | RAM reserved | Desktop video resolution |
Login VSI Knowledge Worker | 2 | 4 GB | 2 GB | 1920 x 1080 |
The following table summarizes the desktop VM configurations used for the NVIDIA nVector workload that the Dell VDI team tested:
Workload | vCPUs | RAM | RAM reserved | Desktop video resolution | vGPU profile |
nVector Knowledge Worker | 2 | 4 GB | 4 GB | 1920 x 1080 | 1B |
Summary of test results
The following table summarizes the host utilization metrics for the Login VSI workload that we tested, and the user density derived from the performance testing:
Instance type | Operating system | User density per host | Users per instance | Average CPU | Average active memory | Average IOPS per user | Average network per user (Mbps) |
Memory Optimized | Win 10 22H2 | 170 | 2.65 | 85.4% | 212 GB | 7.5 | 6.16 |
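The derived per-user columns in the preceding table follow from simple division, as this quick check illustrates (the figures are taken from the Memory Optimized row; "instances" corresponds to the 64 CPU cores per host described earlier):

```python
# Quick arithmetic check of the derived columns in the preceding table,
# using the Memory Optimized row and the 64 CPU cores (instances) per host.
hosts = 3
cores_per_host = 64
user_density = 170                      # users per host

users_per_instance = user_density / cores_per_host
total_users = user_density * hosts

print(f"Users per instance: {users_per_instance:.2f}")       # 2.66 (reported as 2.65)
print(f"Cluster users: {total_users}")                        # 510

# The per-user columns are cluster-wide rates divided by the user count,
# so the underlying cluster totals were roughly:
print(f"Cluster IOPS: {7.5 * total_users:.0f}")               # 3825
print(f"Cluster network: {6.16 * total_users / 1000:.2f} Gbps")  # 3.14
```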
The following tables summarize the host utilization metrics for the nVector workload that we tested, and the user density derived from the performance testing:
Instance type | Operating system | User density per host | Users per instance | Average CPU | Average active memory | Average IOPS per user | Average network per user (Mbps) |
VDI Optimized | Win 10 22H2 | 128 | 2 | 88% | 384 GB | 15.4 | 3.47 |
Average GPU | Frames per second | Image quality | End-user latency (ms) |
25.75% | 23 | 97% | 152 |
The host utilization metrics shown in the preceding table are defined as follows:
- User density—The number of users per compute host that successfully completed the workload test within the acceptable resource limits for the host. For clusters, this number reflects the average of the density achieved for all compute hosts in the cluster.
- Users per instance—The number of users per instance for the Memory optimized or VDI optimized configurations. This value is directly related to the memory-to-CPU-core ratio.
- Average CPU—The average CPU usage over the steady state period. For clusters, this number represents the combined average CPU usage of all compute hosts. On the latest Intel processors, the ESXi host CPU metrics can exceed the rated 100 percent for the host when Turbo Boost is enabled, which is the default setting. An additional 35 percent of CPU capacity is available from the Turbo Boost feature, but this additional headroom is not reflected in the VMware vSphere metrics from which the performance data is gathered.
- Average active memory—For ESXi hosts, the amount of memory that is actively used, as estimated by the VMKernel based on recently touched memory pages. For clusters, this is the average amount of physical guest memory that is actively used across all compute hosts over the steady state period.
- Average IOPS per user—IOPS calculated from the average cluster disk IOPS over the steady state period divided by the number of users.
- Average network usage per user—Average network usage on all hosts calculated over the steady state period divided by the number of users.
- End-user latency—Measures how remote the session feels or how interactive the session is (the amount of lag).
- Frame rate—Measures the number of frames per second delivered to the endpoint.
- Image quality—Measures how much the image was affected and manipulated by the remote display protocol (VMware Blast). The SSIM metric is the structural similarity of screenshots taken on the VDI desktop and on the endpoint (thin client); a minimal example of this comparison follows this list.
- Average GPU—The combined average GPU usage of all installed GPUs over the test period.
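To make the image-quality metric concrete, the following sketch shows how an SSIM score could be computed from a pair of screenshots using scikit-image. The file names are hypothetical, and this is an illustration of the metric itself, not NVIDIA's nVector implementation.

```python
import numpy as np
from PIL import Image
from skimage.metrics import structural_similarity

def image_quality(desktop_png: str, endpoint_png: str) -> float:
    """SSIM between a screenshot taken on the VDI desktop and the same
    frame captured on the endpoint; 1.0 means the remoting protocol
    altered nothing."""
    desktop = np.asarray(Image.open(desktop_png).convert("RGB"))
    endpoint = np.asarray(Image.open(endpoint_png).convert("RGB"))
    return structural_similarity(desktop, endpoint, channel_axis=-1)

# Hypothetical capture files; a score of 0.97 would correspond to the
# 97% image quality reported for the nVector run above.
score = image_quality("desktop_frame.png", "endpoint_frame.png")
print(f"Image quality (SSIM): {score:.2%}")
```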