The testing was performed on a 4-node VxRail cluster. We used the Citrix MCS linked-clone provisioning method to provision pooled-random desktop VMs, with Citrix ThinWire Plus as the remote display protocol.
The following table summarizes the host utilization metrics for the Login VSI workloads that we tested and the user densities derived from Login VSI performance testing. CPU was the bottleneck in every test case: in each test, CPU utilization reached the 85 percent threshold (with a +5 percent margin) that we set.
| Server configuration | Login VSI workload | User density | Average CPU¹ | Average consumed memory | Average active memory | Average IOPS per user | Average network Mbps per user |
|---|---|---|---|---|---|---|---|
| Density Optimized | Knowledge Worker | 131 | 85% | 578 GB | 167 GB | 5.22 | 1.4 Mbps |
| Density Optimized | Power Worker | 110 | 86% | 704 GB | 155 GB | 5.20 | 1.8 Mbps |
| Density Optimized | Task Worker | 200 | 85% | 297 GB | 88 GB | 1.19 | 1.2 Mbps |
| Density Optimized + (6 x NVIDIA T4) | Multimedia Worker (NVIDIA T4-2B) | 48 | 92%² | 449 GB | 385 GB | 12.73 | 17.06 Mbps |
These threshold values, as shown in Table 5, are carefully selected to deliver an optimal combination of excellent EUE and low cost per user while also providing burst capacity for seasonal or intermittent spikes in usage. We did not load the system beyond these thresholds to reach Login VSImax, which indicates the number of sessions that can be active on a system before it becomes saturated.
Memory was not a constraint during testing: the 768 GB of total memory per node was sufficient for all of the Login VSI workloads. With a dual-port 25 GbE NIC available on the hosts, network bandwidth was also not an issue. Disk latency remained under the threshold that we set, and disk performance was good.
For the Multimedia Worker workload test, we relaxed the 85 percent threshold to test with 48 users per VxRail node, which is the maximum number of users that can be hosted on a VxRail node configured with six NVIDIA T4 GPUs and NVIDIA T4-2B vGPU profiles (2 GB of frame buffer per user). The total available frame buffer on a host configured with six NVIDIA T4 GPUs is 96 GB. The CPU utilization recorded was 92 percent. However, the Login VSI scores and host metric results indicate that both user experience and performance were good while running this graphics-intensive workload.
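As a quick check of that sizing arithmetic, the sketch below works through the frame buffer math; the 16 GB of frame buffer per T4 card is the standard NVIDIA T4 specification, and the other values are taken from the text above.

```python
# vGPU capacity arithmetic for the Multimedia Worker configuration.
gpus_per_host = 6
framebuffer_per_t4_gb = 16        # NVIDIA T4 frame buffer per card
framebuffer_per_user_gb = 2       # T4-2B vGPU profile, 2 GB per user

total_framebuffer_gb = gpus_per_host * framebuffer_per_t4_gb              # 96 GB on the host
max_vgpu_users_per_host = total_framebuffer_gb // framebuffer_per_user_gb  # 48 users

print(total_framebuffer_gb, max_vgpu_users_per_host)
```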
We recommend these user densities based on the Login VSI test results and the thresholds that we set for the host utilization parameters. To maintain good EUE, do not exceed these thresholds. You can load more user sessions and exceed these thresholds, but you might experience a degradation in user experience.
The host utilization metrics listed in the table are discussed in the detailed test results that follow.
We performed this test with the Login VSI Knowledge Worker workload on the 4-node VxRail cluster. Host 3 hosted both management and desktop VMs. We populated each compute host with 131 desktop VMs and the management host with 129 desktop VMs. We created pooled-random desktops with the Citrix MCS linked-clone provisioning method and used Citrix ThinWire Plus as the remote display protocol.
CPU usage with all VMs powered on was approximately 12 percent before the test started. The CPU usage steadily increased during the login phase, as shown in the following figure.
During the steady state phase, we recorded an average CPU utilization of 85 percent. This value is close to the pass/fail threshold we set for average CPU utilization (see Table 5). To maintain good EUE, do not exceed this threshold. You can load more user sessions while exceeding this threshold for CPU, but you might experience a degradation in user experience.
As shown in the following figure, the CPU readiness was well below the 10 percent threshold that we set. CPU readiness is the percentage of time that a virtual machine is ready to run but cannot be scheduled on a physical CPU. The CPU readiness percentage was low throughout testing, indicating that the VMs had no significant delays in scheduling CPU time.
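The report does not show how the readiness percentage was derived. The following minimal sketch illustrates the commonly used vCenter conversion from the raw CPU Ready summation counter (milliseconds accumulated per sampling interval) to the percentage that the 10 percent threshold refers to; the per-vCPU normalization is our assumption for multi-vCPU desktops, not something stated in the source.

```python
# Convert the vCenter "CPU Ready" summation counter (ms per sample) to a
# readiness percentage. The 20-second interval matches vCenter real-time
# charts; adjust it for other statistics collection levels.
def cpu_ready_percent(ready_summation_ms: float,
                      interval_seconds: float = 20,
                      vcpus: int = 1) -> float:
    """Return CPU readiness as a percentage of the sampling interval,
    normalized per vCPU (normalization is an assumption, see lead-in)."""
    return ready_summation_ms / (interval_seconds * 1000 * vcpus) * 100

# Example: 400 ms of accumulated ready time on a 2-vCPU VM in a 20 s sample
print(cpu_ready_percent(400, vcpus=2))  # 1.0 (%), well under the 10% threshold
```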
The average steady-state CPU core utilization across the four hosts was 74 percent, as shown in the following figure.
We observed no memory constraints during the testing on either the management or compute hosts. Out of 768 GB of available memory per node, the compute host reached a maximum consumed memory of 594 GB and a steady state average of 578 GB.
Active memory usage reached a maximum of 199 GB and a steady state average of 167 GB. There was no memory ballooning or swapping on the hosts.
Network bandwidth was not an issue during the testing. The network usage, as shown in the following figure, reached a steady state average of 753 Mbps. The busiest period for network traffic was during the steady state phase when a peak value of 926 Mbps was recorded. The average steady state network usage per user was 1.4 Mbps.
Cluster IOPS reached a maximum value of 436 for read IOPS and 2,635 for write IOPS. The average steady state read and write IOPS were 370 and 2,356, respectively. The average disk IOPS (read and write) per user was 5.22.
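The per-user figures quoted here and in the summary table follow from the cluster steady state averages divided by the number of active sessions. A minimal sketch of that arithmetic, using the Knowledge Worker values reported above (variable names are ours):

```python
# Per-user metrics derived from the Knowledge Worker steady state averages
# on the 4-node cluster (522 active sessions).
sessions = 522
read_iops, write_iops = 370, 2_356      # cluster steady state averages
network_mbps = 753                      # cluster steady state average

iops_per_user = (read_iops + write_iops) / sessions
mbps_per_user = network_mbps / sessions

print(f"IOPS per user: {iops_per_user:.2f}")   # ~5.22, matches the summary table
print(f"Mbps per user: {mbps_per_user:.1f}")   # ~1.4, matches the summary table
```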
Cluster disk latency reached a maximum read latency of 0.41 milliseconds and a maximum write latency of 0.78 milliseconds. The average steady state read latency was 0.4 milliseconds, and the average write latency was 0.75 milliseconds.
The baseline score for the Login VSI test was 952. This score falls in the 800 to 1,199 range rated as "Good" by Login VSI. For more information about Login VSI baseline ratings and baseline calculations, see this Login VSImax article. The Login VSI test was run for 522 user sessions for the Knowledge Worker workload. As indicated by the blue line in the following figure, the system reached a VSImax average score of 1,280 when 510 sessions were loaded. This value is well below the VSI threshold score of 1,953 set by the Login VSI tool.
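For reference, the VSI threshold scores quoted for each workload in this section closely track baseline + 1000, which is the relationship Login VSI uses for the VSImax v4.1 threshold. Treat the sketch below as an approximate consistency check rather than the tool's exact calculation; the one-point differences come from rounding of the reported baseline.

```python
# Approximate check of the reported VSI thresholds against baseline + 1000.
results = {                         # workload: (baseline, reported threshold)
    "Knowledge Worker":  (952, 1_953),
    "Power Worker":      (889, 1_889),
    "Task Worker":       (635, 1_636),
    "Multimedia Worker": (1_204, 2_204),
}

for workload, (baseline, threshold) in results.items():
    print(f"{workload}: baseline + 1000 = {baseline + 1000}, reported = {threshold}")
```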
During the testing, VSImax was never reached, which typically indicates a stable system and a better user experience. See Appendix A, which explains the Login VSI metrics discussed here.
During manual interaction with the sessions in the steady state phase, mouse and window movement were responsive, and video playback was good. No "stuck sessions" were reported during the testing, indicating that the system was not overloaded at any point.
| Login VSI baseline | VSI index average | VSImax reached | VSI threshold |
|---|---|---|---|
| 952 | 1,280 | No | 1,953 |
We performed this test with the Login VSI Power Worker workload on the 4-node VxRail cluster. Host 3 was provisioned with both management and desktop VMs. We populated each compute host with 110 desktop VMs and the management host with 105 desktop VMs. We created pooled-random desktops with the Citrix MCS linked-clone provisioning method and used Citrix ThinWire Plus as the remote display protocol.
As shown in the following figure, CPU usage with all VMs powered on was approximately 12 percent before the test started. The CPU usage steadily increased during the login phase. During steady state, an average CPU utilization of 86 percent was recorded. This value is close to the pass/fail threshold we set for average CPU utilization (see Table 5). To maintain good EUE, do not exceed this threshold. You can load more user sessions while exceeding this threshold for CPU, but you might experience a degradation in user experience.
As shown in the following figure, the CPU readiness was well below the 10 percent threshold that we set. The CPU readiness percentage was low throughout testing, indicating that the VMs had no significant delays in scheduling CPU time.
As shown in the following figure, the average steady state CPU core utilization across the four hosts was 77 percent.
We observed no memory constraints during the testing on either the management or compute hosts. Out of 768 GB of available memory per node, the compute host reached a maximum consumed memory of 706 GB and a steady state average of 704 GB, as shown in the following figure.
Active memory usage reached a maximum of 178 GB, and steady state average memory usage was 155 GB, as shown in the following figure. There was no memory ballooning or swapping on the hosts.
Network bandwidth was not an issue in this test. A steady state average of 786 Mbps was recorded during the test. The busiest period for network traffic was towards the end of the logout phase when it reached a maximum network usage of 1,044 Mbps. The average steady state network usage per user was 1.8 Mbps.
As shown in the following figure, the cluster IOPS reached a maximum value of 391 for read IOPS and 2,269 for write IOPS. The average steady state read and write IOPS were 253 and 2,011, respectively. The average disk IOPS per user during the steady state period was 5.2.
As shown in the following figure, cluster disk latency reached a maximum read latency of 0.44 milliseconds and a maximum write latency of 0.89 milliseconds during the steady state phase. The average steady state read latency was 0.41 milliseconds, and the average steady state write latency was 0.85 milliseconds.
The baseline score for the Login VSI test was 889. This score falls in the 800 to 1,199 range rated as "Good" by Login VSI. For more information about Login VSI baseline ratings and baseline calculations, see this Login VSImax article. The Login VSI test was run for 435 user sessions for the Power Worker workload. As indicated by the blue line in the following figure, the system reached a VSImax average score of 1,240 when 435 sessions were loaded. This is well below the VSI threshold score of 1,889 set by the Login VSI tool.
For the duration of testing, VSImax was never reached, which normally indicates a stable system and a better user experience. See Appendix A, which explains the Login VSI metrics discussed here.
During manual interaction with the sessions in the steady state phase, mouse and window movement were responsive, and video playback was good. No "stuck sessions" were reported during the testing, indicating that the system was not overloaded at any time.
| Login VSI baseline | VSI index average | VSImax reached | VSI threshold |
|---|---|---|---|
| 889 | 1,240 | No | 1,889 |
We performed this test with the Login VSI Task Worker workload on the 4-node VxRail cluster. Host 3 was provisioned with both management and Remote Desktop Session Host (RDSH) VMs. We populated each host with eight RDSH VMs, and each host ran 200 Task Worker sessions (25 sessions per RDSH VM). The RDSH VMs were provisioned using Citrix MCS, and we used Citrix ThinWire Plus as the remote display protocol.
As shown in the following figure, CPU usage with all VMs powered on was approximately 1 percent before the test started. The CPU usage steadily increased during the login phase. During the steady state phase, an average CPU utilization of 85 percent was recorded across the hosts. This value is close to the pass/fail threshold we set for average CPU utilization (see Table 5). To maintain good EUE, do not exceed this threshold. You can load more user sessions while exceeding this threshold for CPU, but you might experience a degradation in user experience.
As shown in the following figure, the CPU readiness was well below the 10 percent threshold that we set. The CPU readiness percentage was low throughout testing, indicating that the VMs had no significant delays in scheduling CPU time.
The average steady state CPU core utilization across the four hosts was 69 percent, as shown in the following figure.
We observed no memory constraints during the testing on either the management or compute hosts. Out of 768 GB of available memory per node, the compute host reached a maximum consumed memory of 317 GB and a steady state average of 297 GB, as shown in the following figure.
Active memory usage reached a maximum of 148 GB and a steady state average memory of 88 GB, as shown in the following figure. There was no memory ballooning or swapping on the hosts.
Network bandwidth was not an issue in this test. A steady state average of 956 Mbps was recorded during the test. The busiest period for network traffic was in the steady state phase, recording a maximum network usage of 1,320 Mbps. The average steady state network usage per user was 1.2 Mbps.
Cluster IOPS reached a maximum value of 172 for read IOPS and a maximum value of 1,952 for write IOPS. The average steady state read and write IOPS figures were 80 and 871, respectively. The average disk IOPS (read and write) per user was 1.2.
As shown in the following figure, cluster disk latency reached a maximum read latency of 1.18 milliseconds and a maximum write latency of 2.74 milliseconds during the steady state phase. The average steady state read latency was 1.14 milliseconds, and the average steady state write latency was 2.51 milliseconds.
The baseline score for the Login VSI test was 635. This score falls in the 0 to 799 range rated as "Very Good" by Login VSI. For more information about Login VSI baseline ratings and baseline calculations, see this Login VSImax article. The Login VSI test was run for 800 RDSH user sessions for the Task Worker workload. As indicated by the blue line in the following figure, the system reached a VSImax average score of 964 when 800 sessions were loaded. This is well below the VSI threshold score of 1,636 set by the Login VSI tool.
For the duration of testing, VSImax was never reached, which normally indicates a stable system and a better user experience. See Appendix A, which explains the Login VSI metrics discussed here.
During manual interaction with the sessions in the steady state phase, mouse and window movement were responsive, and video playback was good. Only three "stuck sessions" were reported during the testing, indicating that the system was not overloaded at any time.
| Login VSI baseline | VSI index average | VSImax reached | VSI threshold |
|---|---|---|---|
| 635 | 964 | No | 1,636 |
We performed this test with the Login VSI Multimedia Worker workload. We configured six NVIDIA T4 GPUs on compute host 1 and provisioned the host with 48 vGPU-enabled desktop VMs, each using the NVIDIA T4-2B vGPU profile. The desktop VMs were provisioned using Citrix MCS, and we used Citrix ThinWire Plus as the remote display protocol.
As shown in the following figure, CPU usage with all GPU-enabled VMs powered on was approximately 8 percent before the test started. The CPU usage steadily increased during the login phase. During the steady-state phase, an average CPU utilization of 92 percent was recorded on GPU-enabled compute host 1. We relaxed the 85 percent threshold for this testing so that we could test with 48 users. In a production environment, you can decrease the user density or use a higher-bin processor to achieve the 85 percent CPU utilization threshold.
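As a rough illustration only (this estimate is ours, not part of the validated testing, and it assumes CPU utilization scales approximately linearly with session count on the same processor), the density that would bring utilization back under the 85 percent threshold can be estimated as follows:

```python
# Back-of-the-envelope estimate of the Multimedia Worker density that would
# meet the 85 percent CPU threshold, assuming roughly linear scaling of CPU
# load with session count (an assumption, not a measured result).
measured_users = 48
measured_cpu_pct = 92
target_cpu_pct = 85

estimated_users = int(measured_users * target_cpu_pct / measured_cpu_pct)
print(estimated_users)   # ~44 sessions per GPU-enabled host
```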
As shown in the following figure, CPU readiness was well below the 10 percent threshold that we set. The CPU readiness percentage was low throughout testing, indicating that the VMs had no significant delays in scheduling CPU time.
The average steady-state CPU core utilization across the four hosts was 81 percent, as shown in the following figure:
The following figure shows the GPU usage across the six NVIDIA T4 GPUs that we configured on the host. The GPU usage during the steady-state period across the six GPUs averaged approximately 39 percent. A peak GPU usage of 78 percent was recorded on GPU 4 during the steady-state phase.
We observed no memory constraints during the testing on the GPU-enabled compute host. Out of 768 GB of available memory per node, the GPU-enabled compute host reached a maximum consumed memory of 449 GB and a maximum active memory of 385 GB, as shown in the following figures. No variations in memory usage occurred throughout the test because all vGPU-enabled VM memory was reserved. There was no memory ballooning or swapping on the hosts.
Network bandwidth was not an issue in this test. A steady-state average of 802 Mbps was recorded during the test, as shown in the following figure. The busiest period for network traffic was when a maximum network usage of 1,320 Mbps was recorded during the steady-state phase. The average steady state network usage per user was 16.71 Mbps.
As shown in the following figure, cluster IOPS reached a maximum value of 498 for read IOPS and a maximum value of 537 for write IOPS. The average steady state read and write IOPS figures were 184 and 427, respectively. The average disk IOPS (read and write) per user was 12.73.
As shown in the following figure, cluster disk latency reached a maximum read latency of 0.51 milliseconds and a maximum write latency of 2.84 milliseconds during the steady-state phase. The average steady-state read latency was 0.39 milliseconds, and the average steady-state write latency was 1.34 milliseconds.
The baseline score for the Login VSI test was 1,204. This score falls within the 1,200 to 1,599 range, which Login VSI rates as "Fair." For more information about Login VSI baseline ratings and baseline calculations, see this Login VSImax article. The Login VSI test was run for 48 multimedia user sessions. As indicated by the blue line in the following figure, the system reached a VSImax average score of 1,940 when 48 sessions were loaded. This is well below the VSI threshold score of 2,204 set by the Login VSI tool.
For the duration of testing, VSImax was never reached, which indicates a stable system and a better user experience. For an explanation of the Login VSI metrics, see Appendix A.
During manual interaction with the sessions during the steady-state phase, the mouse and window movement were responsive, and video playback was good. No "stuck sessions" were reported during the testing, indicating that the system was not overloaded at any time. The following table shows the score summary:
| Login VSI baseline | VSI index average | VSImax reached | VSI threshold |
|---|---|---|---|
| 1,204 | 1,940 | No | 2,204 |