VDI Data Protection - Part 2: A Deep Dive into VMware Horizon 7 Multi-Site Scenarios
Fri, 03 Apr 2020 14:41:22 -0000
In the first part of this blog series – VDI Data Protection - Part 1: Protecting Your VDI Environment - What You Need to Consider, we discussed major components of virtual desktop infrastructure (VDI) data protection and the parameters involved in selecting a data protection plan that is right for your VDI environment. As discussed in that blog post, disaster recovery and operational backup are two significant aspects of data protection.
A VDI outage
To formulate an optimal DR plan for your VDI environment, carry out a complete risk assessment of the likely disaster scenarios and their business impact. Disaster recovery protection and execution come at a cost, so you should strike the right balance between cost and the availability of VDI services required by the organization. You should:
- Divide the VDI user groups into segments based on the criticality of the applications they use.
- Establish an optimal Recovery Point Objective (RPO) and Recovery Time Objective (RTO) for each desktop group so that a disaster has the least impact on your business while staying within the organization’s budget. (RTO is the elapsed time until virtual desktops are available or recovered after an incident. RPO is the acceptable duration of data loss, in minutes or hours, from a VDI environment in the event of a disaster.)
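The segmentation described above can be captured in a small planning table. The following Python sketch is illustrative only: the group names, RPO/RTO targets, and replication intervals are hypothetical values, not recommendations from this blog. It also makes the simplifying assumption that the worst-case data loss equals the replication interval.

```python
from dataclasses import dataclass


@dataclass
class DesktopGroup:
    """One VDI user segment and its DR targets (all values hypothetical)."""
    name: str
    rpo_minutes: int                   # acceptable window of data loss
    rto_minutes: int                   # acceptable time until desktops are back
    replication_interval_minutes: int  # how often user data is replicated


def meets_rpo(group: DesktopGroup) -> bool:
    # Simplifying assumption: worst-case data loss equals the replication interval.
    return group.replication_interval_minutes <= group.rpo_minutes


groups = [
    DesktopGroup("critical-apps", rpo_minutes=15, rto_minutes=30,
                 replication_interval_minutes=5),
    DesktopGroup("general-office", rpo_minutes=240, rto_minutes=480,
                 replication_interval_minutes=1440),
]

for g in groups:
    status = "OK" if meets_rpo(g) else "replication too infrequent for the RPO"
    print(f"{g.name}: {status}")
```

A group that fails the check needs either more frequent replication (higher cost) or a relaxed RPO, which is the cost-versus-availability trade-off discussed above.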
In this blog, we will deep dive into the different multi-site approaches used in a VMware Horizon 7 VDI Disaster Recovery plan.
Multi-Site Horizon 7 VDI with Cloud Pod Architecture
Cloud Pod Architecture (CPA) is the foundation for VMware’s Horizon 7 VDI Disaster Recovery. A pod is made up of a group of interconnected Connection Servers that broker connections to desktops or published applications. A pod consists of multiple blocks, and a block is a collection of one or more vSphere clusters hosting pools of desktops or applications. Each block has a dedicated vCenter Server.
With CPA, you can have multiple pods connected in a federation to improve reliability. For example, you can have pods in each of your CPA sites, and all of them can be connected via the CPA to form a federation. In a CPA environment, a site is a collection of well-connected pods in the same physical location, typically in a single data center. Connection Server instances in a pod federation use a data-sharing approach known as the Global Data Layer to replicate information about pod federation topology, user/group assignments, policies, and other CPA configuration settings.
With a global entitlement, you can publish desktop icons from Horizon pools from any of the pods in the federation. You can create a global entitlement to publish the VDI desktop icon, then assign users or user groups to the global entitlement, which gives users access to desktops across multiple sites. Each site has a minimum of one pod.
Figure 1 shows a basic CPA architecture involving two Horizon 7 sites or data centers. For more information regarding the architecture and configuration of Horizon 7 CPA, see the VMware documentation.
Figure 1: VMware Horizon 7 Multi-Site Deployment with Cloud Pod Architecture
When a user launches a VDI desktop icon, the following occurs:
- The request goes to a Global Server Load Balancer (for example, VMware NSX Advanced Load Balancer).
- The Global Server Load Balancer redirects the request to a Horizon Connection Server in one of the pods.
- The Horizon Connection Server brokers the virtual desktops.
Home Site and Scope policies defined in a global entitlement determine the scope of the search to identify a desktop for the user. A Home Site is a relationship between a user or group and a CPA site. Typically, a user’s data and profile reside in the site that is configured as a Home Site. With the Home Site option enabled, the preference will be to launch the desktop from the Home Site, irrespective of the user’s location. If the Home Site is not configured, is unavailable, or does not have resources to satisfy the user's request, Horizon continues searching other sites according to the Scope policy set for the global entitlement. The Scope policy options available in a Horizon 7 global entitlement are ‘All Sites,’ ‘Within Site,’ and ‘Within Pod.’
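The search order described above can be modeled in a few lines. This is a simplified sketch of the behavior as described in this post, not the Horizon broker's actual algorithm or API: the `Pod` class, the `find_pod` function, the scope constants, and the search order within 'All Sites' are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pod:
    """A pod in the federation (simplified, hypothetical model)."""
    name: str
    site: str
    free_desktops: int
    available: bool = True


def find_pod(pods: List[Pod], connecting_pod: Pod, scope: str,
             home_site: Optional[str] = None) -> Optional[Pod]:
    """Pick the pod that serves the desktop, or None if nothing qualifies."""

    def usable(pod: Pod) -> bool:
        return pod.available and pod.free_desktops > 0

    # Step 1: prefer the Home Site, wherever the user connected from.
    if home_site is not None:
        for pod in pods:
            if pod.site == home_site and usable(pod):
                return pod

    # Step 2: fall back to the Scope policy of the global entitlement.
    if scope == "WITHIN_POD":
        candidates = [connecting_pod]
    elif scope == "WITHIN_SITE":
        candidates = [p for p in pods if p.site == connecting_pod.site]
    elif scope == "ALL_SITES":
        # Assumed ordering: connecting pod, then its site, then other sites.
        candidates = sorted(
            pods,
            key=lambda p: (p is not connecting_pod, p.site != connecting_pod.site),
        )
    else:
        raise ValueError(f"unknown scope: {scope}")

    for pod in candidates:
        if usable(pod):
            return pod
    return None


pod_a = Pod("pod-a", site="site-a", free_desktops=5)
pod_b = Pod("pod-b", site="site-b", free_desktops=10)
federation = [pod_a, pod_b]

# The user connects through site B but is homed to site A.
print(find_pod(federation, pod_b, "ALL_SITES", home_site="site-a").name)

# Simulate an outage at the Home Site: the search falls back to the Scope policy.
pod_a.available = False
print(find_pod(federation, pod_b, "ALL_SITES", home_site="site-a").name)
```

In this model, a 'Within Pod' scope with an unavailable connecting pod and no usable Home Site returns nothing, which illustrates why the Scope policy has to match the DR approach.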
Horizon CPA architecture is typically used with non-persistent desktop models based on instant clones or linked clones, where we can de-couple the user data from the rest of the desktop. A similar desktop virtual machine (VM), without user data, is also provisioned in other sites. The user data can then be replicated to the other sites to provide a consistent experience while accessing these desktops.
You can have an active/active or active/passive approach for your VDI DR plan based on Horizon CPA. When providing DR for Horizon-based services, it is important to consider the active or passive approach from the perspective of the user:
- With an active/active approach, services delivered to a particular user will be available on all sites.
- With an active/passive approach, only the primary site will be active, and in the event of a disaster, services will be available to the user on the secondary site.
Home Site and Scope policies defined in the global entitlement should align with the DR approach devised for your VDI environment. For example, in a VDI environment based on an active/active approach, the global entitlement policies are configured to search desktops from all the sites in the pod federation. Even if one of the sites is not available, the desktop can be searched and launched from the pods in the other active sites. In a VDI environment based on an active/passive DR approach, global entitlement policies should be set to launch desktops only from the pods in the active site. If there is an outage in the primary active site, services are enabled in the secondary passive site, with manual intervention, and desktops are launched from the secondary site. For more information about configuring global entitlement, see the VMware documentation.
Even though Horizon 7 CPA supports both approaches, the decision concerning the DR approach will depend on a variety of factors, including the availability requirements of business-critical applications, the distance between sites, the cost of the DR infrastructure, replication of data, and network connectivity.
In an active/active multi-site scenario, user data should be replicated synchronously across all sites in real time to maintain application responsiveness and the user experience. Replicating large data files across a Wide Area Network (WAN) without impacting the user experience can be a challenge. For example, consider a user who works with an application that processes large files. If the user is redirected to a different active site at each login, they may experience slowness, or the application may not work properly, because the associated data is not available in real time in all sites. Replication speed is critical in this scenario; it depends on the distance between sites, the type of WAN, the available bandwidth, and so on. Real-time replication traffic can also consume a large share of your network bandwidth and may affect other traffic, including production. So, you must consider these implications before deciding on an active/active approach.
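A rough estimate shows why large files are a problem for real-time replication. The sketch below is a back-of-the-envelope calculation only: the file size, link speed, and the 70% link-efficiency factor are hypothetical inputs, and the formula ignores latency, protocol behavior, and competing traffic.

```python
def transfer_seconds(size_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the time to move size_gb over a WAN link.

    Uses decimal units (1 GB = 8000 megabits). `efficiency` is an assumed
    fraction of the nominal bandwidth that replication can actually use.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed and efficiency must be positive")
    megabits = size_gb * 8 * 1000
    return megabits / (link_mbps * efficiency)


# Hypothetical case: a 2 GB working file over a 100 Mbps inter-site link.
seconds = transfer_seconds(size_gb=2, link_mbps=100)
print(f"about {seconds / 60:.1f} minutes")
```

If a user can log in at the other site sooner than the data can arrive, that user will see stale or missing files, which is the failure mode described above.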
Alternatively, you can have a partial active/active DR approach, with 50% of users homed to one of the active sites. The other site will still be active, serving the other 50% of users. However, each data center should have enough capacity to host 100% of the users in the event of a disaster at the other site. With this approach, you can avoid the replication challenges described above. You can schedule the replication outside business hours on a daily or weekly basis without impacting critical production traffic. Users also get a better experience because their desktops launch at the Home Site, where their data and profile reside.
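The sizing consequence of this approach can be worked through with simple arithmetic. The user count and per-host density below are hypothetical placeholders; substitute the figures from your own sizing exercise.

```python
import math


def hosts_needed(users: int, users_per_host: int) -> int:
    """Number of hosts required for a user count at a given density."""
    return math.ceil(users / users_per_host)


TOTAL_USERS = 2000     # hypothetical
USERS_PER_HOST = 100   # hypothetical density

# Normal operation: each site serves half of the users.
normal_hosts_per_site = hosts_needed(TOTAL_USERS // 2, USERS_PER_HOST)

# DR sizing: each site must be able to serve all users on its own.
dr_hosts_per_site = hosts_needed(TOTAL_USERS, USERS_PER_HOST)

print(normal_hosts_per_site, dr_hosts_per_site)
```

In other words, half of the compute at each site sits in reserve during normal operation, which is part of the cost of this DR approach.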
An active/passive DR solution is the simplest approach in Horizon 7 CPA multi-site deployments because only one site is active at any given point in time. Users are assigned to global entitlements and are homed to the primary active site, while the secondary site remains passive. In this approach, you do not have to worry about the replication challenges because the desktop always launches on the Home Site, which is the primary or active site for users. However, data is still replicated to the passive site at a frequency determined by the RPO requirements. In the event of a disaster, services in the primary site are failed over to the secondary passive site with manual intervention, and the global entitlement then launches desktops on the secondary site.
Table 1 shows the limits of various components in CPA architecture for Horizon version 7.8 or later. For more information regarding limits refer to VMware Horizon 7 sizing limits and recommendations.
Table 1: Cloud Pod Limits in Horizon 7 CPA
Multi-Site Horizon 7 VDI with vSAN stretched clusters
Architectures based on VMware’s vSAN technology support active/passive multi-site deployment for Horizon 7 VDI. This is a truly active/passive approach that leverages a stretched cluster: one that extends a vSAN cluster across two sites or data centers. The solution builds on the VMware vSAN replication technology that replicates data across the sites involved in a stretched cluster and the VMware vSphere HA feature that provides high availability across the hosts in a cluster. Horizon 7 management servers and desktop VMs are pinned to the active site using VMware vSphere Storage DRS, VM DRS, host DRS groups, and VM-Host affinity rules. In the event of a disaster in the active site, the VMs are failed over and restarted in the passive site using VMware HA. The pinning of Horizon management and desktop VMs to storage and compute resources on a single site means that these VMs only reside on a single site at any given time: either the originally active site in normal operation or the originally passive site after a DR event has occurred.
A vSAN stretched architecture consists of three fault domains called preferred (active site), secondary (passive site), and witness (third site). The latency between data sites should not be more than 5 ms round-trip time (RTT). The latency between the data sites and the witness site should not be more than 200 ms RTT. vSAN communication between the data sites can be over stretched L2 or L3 networks, and vSAN communication between data sites and the witness site can be routed over L3. For details on a vSAN stretched cluster-based DR approach, see the VMware documentation.
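The two latency limits above lend themselves to a simple pre-deployment check. The sketch below only compares measured round-trip times against the 5 ms and 200 ms figures quoted in this post; the function name and the sample measurements are hypothetical, and how you measure RTT is left to your own tooling.

```python
from typing import Dict, List

MAX_DATA_SITE_RTT_MS = 5.0    # between the preferred and secondary sites
MAX_WITNESS_RTT_MS = 200.0    # between each data site and the witness site


def latency_problems(data_site_rtt_ms: float,
                     witness_rtt_ms: Dict[str, float]) -> List[str]:
    """Return a list of limit violations; an empty list means all limits are met."""
    problems = []
    if data_site_rtt_ms > MAX_DATA_SITE_RTT_MS:
        problems.append(
            f"data sites: {data_site_rtt_ms} ms exceeds {MAX_DATA_SITE_RTT_MS} ms"
        )
    for site, rtt in witness_rtt_ms.items():
        if rtt > MAX_WITNESS_RTT_MS:
            problems.append(
                f"{site} to witness: {rtt} ms exceeds {MAX_WITNESS_RTT_MS} ms"
            )
    return problems


# Hypothetical measurements in milliseconds.
print(latency_problems(3.2, {"preferred": 80.0, "secondary": 120.0}))
print(latency_problems(7.5, {"preferred": 80.0, "secondary": 250.0}))
```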
A typical use-case when leveraging a vSAN stretched cluster DR approach is full-clone persistent desktop pools, where user data is tightly integrated with the desktop. vSAN storage cluster technology replicates VDI VMs across the sites in the stretched cluster in real time. The architecture only supports data centers that are close to each other and connected by a high-bandwidth, low-latency network. In this DR approach, RTO is higher than with a CPA-based DR solution because the management and desktop VMs need to be restarted at the passive site in the event of a disaster.
Dell EMC VxRail hyper-converged appliances based on VMware vSAN technology can run a multi-site Horizon 7 VDI environment using the vSAN stretched cluster architecture discussed above. For more details on deploying a VMware Horizon 7 VDI on VxRail appliances, see the Design Guide under "Designs for VMware Horizon on VxRail and vSAN Ready Nodes" on our VDI Info Hub for Ready Solutions website.
Getting your users back to being productive with a minimum loss of time and data is of vital importance in a DR event. With technologies where we can de-couple user data from a virtual desktop, Horizon 7 VDI disaster recovery boils down to how you handle the replication of user data across your sites while providing an equivalent set of VMs in other sites. For non-persistent desktop use-cases where user data is usually decoupled from the rest of the desktop, a Horizon CPA-based DR approach is recommended. If you are planning for an active/active DR solution, you should consider the replication challenges discussed earlier in this blog. For use-cases where full-clone persistent desktops cannot be converted to a non-persistent model for various business reasons, an active/passive DR approach based on vSAN stretched cluster architecture is the best fit.
In the next blog in this series, we will discuss the operational backup aspects of VDI data protection based on testing done by the Dell EMC Ready Solutions for VDI Engineering team. So, stay tuned!
Principal Engineer at Dell EMC, Technical Marketing, Ready Solutions for VDI
Related Blog Posts
A VMware Horizon solution on Dell EMC PowerEdge R7525 servers based on 2nd Gen AMD EPYC processors
Tue, 02 Jun 2020 09:37:59 -0000
Many VDI deployments experience performance issues and poor user experience when trying to maintain a cost-effective consolidation ratio. A higher consolidation ratio of virtual machines to physical servers offers better economics and lower Total Cost of Ownership (TCO). The amount of TCO benefits might vary depending on the size of your VDI environment. It is a challenge for today’s organizations to deploy a cost-effective VDI environment while striking the right balance between performance and density.
The Dell Technologies Ready Solutions for VDI team provides a solution that resolves these challenges. It uses VMware Horizon based on Dell EMC PowerEdge R7525 servers equipped with new 2nd Gen AMD EPYC processors. The PowerEdge R7525 is a highly scalable, two-socket 2U rack server that delivers powerful performance and flexible configuration options. The servers are equipped with 2nd Gen AMD EPYC processors that can accommodate up to 64 cores per socket. A dual-socket R7525 server can have up to 128 cores, providing excellent user densities and a lower TCO for your VDI deployment. This solution offers you the flexibility to correctly size your VDI environment for performance and an exceptional end-user experience.
In this blog, we will discuss the key benefits of this solution and the results of performance testing carried out by the Dell Technologies Ready Solutions for VDI team.
Key benefits of the solution
- High performance and density: PowerEdge R7525 servers based on 2nd Gen AMD EPYC processors are designed for performance, and with a high number of cores per CPU socket, you can achieve higher user densities per server.
- Lower security risks with a diverse CPU architecture: The 2nd Gen AMD EPYC processors in this solution present an opportunity to diversify the CPU architecture within your data center. A data center with diverse CPU architecture poses a lower risk to your organization during security threats. Customers can move business-critical data to an appropriate and safe environment while a security event is resolved. With AMD Infinity Guard, which includes the AMD secure processor, Secure Memory Encryption (SME), and Secure Encrypted Virtualization (SEV) capabilities, you can minimize potential attack surfaces and deploy your workloads with confidence.
- Excellent graphics capability: The solution also offers excellent graphics performance with the capability of hosting up to 6 NVIDIA T4 cards (each with x16 PCIe lane access) on the PowerEdge R7525 server, providing up to 96 GB of graphics frame-buffer per server.
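The headline figures in the list above follow from simple per-server arithmetic. In the sketch below, the core and card counts come from this post, and the 16 GB of frame buffer per card is the NVIDIA T4 specification. The per-user frame-buffer allocations are hypothetical examples, not the profiles used in the testing.

```python
SOCKETS_PER_SERVER = 2
MAX_CORES_PER_SOCKET = 64      # 2nd Gen AMD EPYC maximum quoted in this post
T4_CARDS_PER_SERVER = 6
T4_FRAME_BUFFER_GB = 16        # NVIDIA T4 specification

total_cores = SOCKETS_PER_SERVER * MAX_CORES_PER_SOCKET
total_frame_buffer_gb = T4_CARDS_PER_SERVER * T4_FRAME_BUFFER_GB
print(total_cores, total_frame_buffer_gb)


def max_gpu_users(frame_buffer_per_user_gb: int) -> int:
    """Upper bound on GPU-enabled desktops per server for a given allocation."""
    return total_frame_buffer_gb // frame_buffer_per_user_gb


# Hypothetical per-user allocations of 1 GB and 2 GB.
print(max_gpu_users(1), max_gpu_users(2))
```

These are upper bounds on frame buffer only; CPU, memory, and the workload itself may limit density well before the frame buffer is exhausted.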
Solution performance testing
The Dell Technologies Ready Solutions for VDI team used the Login VSI benchmark tool for performance testing. We performed testing on a 3-node VMware vSAN cluster based on PowerEdge R7525 servers with a ‘Density Optimized’ configuration. VMware ESXi 6.7 update 3 was used as the hypervisor and the Horizon 7 virtual desktops were provisioned by instant clones. See Figure 1 for the solution stack.
Figure 1: VMware Horizon on PowerEdge R7525 solution stack
The environment configuration was:
- PowerEdge R7525 server (Density Optimized configuration)
- 2 x AMD EPYC 7502 (32 core @2.5 GHz)
- 1024 GB (16 x 64 GB @ 3200 MHz)
- 2 x 800 GB WI SAS SSD (cache)
- 4 x 1.92 TB MU SAS SSD (capacity)
- Mellanox ConnectX-5, 25 GbE dual-port SFP28
- 6 x NVIDIA T4
- vSAN all-flash datastore
- VMware ESXi 6.7u3 hypervisor
- VMware Horizon 7.10 VDI software layer
See Table 1 for the VM configuration that we tested for different Login VSI workloads. For details of the test environment, configuration and testing process and an analysis of the test results, see the Reference Architecture Guide available on the Dell Technologies VDI Infohub.
Table 1: Virtual machine configuration for different Login VSI workloads
Figure 2 shows the recommended density figures per host for Login VSI workloads based on our performance testing. We recommend these density figures after monitoring and analyzing a combination of host utilization parameters (CPU, memory, network and disk utilization) and Login VSI results. We monitored the relevant host utilization parameters and applied relatively conservative thresholds for the Login VSI testing. Thresholds are carefully selected to deliver an optimal combination of excellent end-user experience and cost-per user while also providing burst capacity for seasonal or intermittent spikes in usage.
Figure 2: Horizon on PowerEdge R7525 solution user density figures
Our performance testing achieved excellent consolidation ratios for the solution while maintaining good performance for typical VDI workloads. Dual-socket PowerEdge R7525 servers based on 2nd Gen AMD EPYC processors support up to 128 cores per server, increasing user density within a VDI environment.
If you are running a mixed workload on your hypervisor, including your VDI workload, note that VMware per-CPU licensing covers up to 32 cores per CPU, so processors with more than 32 cores require additional licenses. See the licensing details here. However, this limitation doesn't apply to the VMware vSphere Desktop edition, which is intended only for desktop virtualization and is licensed based on powered-on desktop virtual machines.
The high CPU core density per server results in exceptional user densities and high performance for VDI workloads. The 2nd Gen AMD EPYC processors with high core counts present an opportunity to design your VDI environment with CPU oversubscription ratios that result in the right balance between performance and user density.
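The CPU oversubscription ratio mentioned above is the number of vCPUs allocated to desktops divided by the number of physical cores in the host. The sketch below uses the 64 physical cores of the tested configuration (2 x 32-core AMD EPYC 7502). The desktop count and vCPUs per desktop are hypothetical, because the tested densities and VM configurations are not reproduced in this text.

```python
def oversubscription_ratio(desktops: int, vcpus_per_desktop: int,
                           physical_cores: int) -> float:
    """vCPUs allocated per physical core on one host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return desktops * vcpus_per_desktop / physical_cores


PHYSICAL_CORES = 2 * 32   # 2 x AMD EPYC 7502, as in the tested configuration

# Hypothetical density: 200 desktops with 2 vCPUs each on one host.
ratio = oversubscription_ratio(desktops=200, vcpus_per_desktop=2,
                               physical_cores=PHYSICAL_CORES)
print(f"{ratio:.2f} vCPUs per physical core")
```

A higher core count lets you either host more desktops at the same ratio or lower the ratio for the same number of desktops, which is the performance-versus-density balance described above.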
VDI Data Protection - Part 3: An Operational Backup Approach for Horizon 7
Fri, 03 Apr 2020 14:54:31 -0000
In Part 1 of this blog series we discussed how disaster recovery and operational backup are two significant aspects of Virtual Desktop Infrastructure (VDI) data protection. In this blog, we will discuss the operational backup aspects of VMware Horizon data protection. For details on disaster recovery, see Part 2.
Loss of VDI environment availability or data has the potential to degrade a user’s ability to perform daily operational tasks. So, it is important for organizations to have an optimal plan to back up and recover VDI data. A robust data protection plan should meet the availability, Recovery Time Objective (RTO), and Recovery Point Objective (RPO) targets defined in Service Level Agreements (SLAs).
For a VMware Horizon virtual desktop environment, three key component layers require protection:
- The desktop layer, that is, the user’s desktop (which is often made available to multiple users using an appropriate provisioning technology)
- The management layer (which performs the provisioning, brokering, policy management, and related management functions)
- The user data layer (stored in user profile shares, home folders, and so on)
The backup and recovery requirements of each component layer depend on the type of the desktop pools and provisioning method used in the Horizon 7 environment. For example, a persistent (stateful) desktop pool can be created with full clones or full virtual machines, which requires a full backup of the virtual machines. A persistent pool can also be created with Horizon instant clones or linked clones with App Volumes (App Stacks and User Writable Volumes) to store the user-installed apps and user-related data. In this scenario, the master image of the desktop and the persistent data related to App Volumes need protection.
For a non-persistent (stateless) desktop pool, only the master image of the desktop needs to be protected. In the case of non-persistent desktops, you should consider protecting the user data that is stored in user profile shares and home folders, based on the user environment.
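The rules in the two paragraphs above can be summarized as a small lookup. This Python sketch only restates what this post says about the desktop and user data layers; the function name, parameter names, and string labels are hypothetical and do not correspond to any Horizon or Avamar API.

```python
from typing import List

PROVISIONING_METHODS = {"full_clone", "instant_clone", "linked_clone"}


def desktop_backup_scope(persistent: bool, provisioning: str,
                         uses_app_volumes: bool = False) -> List[str]:
    """List what needs protection for a desktop pool, per the rules above."""
    if provisioning not in PROVISIONING_METHODS:
        raise ValueError(f"unknown provisioning method: {provisioning}")

    # Persistent full clones: every desktop VM carries unique state.
    if persistent and provisioning == "full_clone":
        return ["full backup of each desktop VM"]

    scope = ["master image"]
    if persistent and uses_app_volumes:
        scope.append("App Volumes data (App Stacks and User Writable Volumes)")
    if not persistent:
        scope.append("user profile shares and home folders")
    return scope


print(desktop_backup_scope(persistent=True, provisioning="full_clone"))
print(desktop_backup_scope(persistent=True, provisioning="instant_clone",
                           uses_app_volumes=True))
print(desktop_backup_scope(persistent=False, provisioning="instant_clone"))
```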
Figure 1: Horizon 7 Operational Backup Approach
Dell EMC offers comprehensive backup and recovery solutions that include products like Integrated Data Protection Appliances (IDPA), Avamar, Data Domain, and Data Protection Suite. For the data protection of a Horizon 7 environment, you can choose from this broad range of Dell EMC data protection products to match your user environment and existing data protection regime. For further information, visit the Dell EMC Data Protection web page.
The Dell EMC Ready Solutions for VDI team has published an operations guide that outlines how Avamar Virtual Edition (AVE) and Data Domain Virtual Edition (DD VE) can be used to facilitate backup and recovery of a Horizon 7 non-persistent desktop pool provisioned by instant clone technology. AVE and DD VE are the software-defined versions of the industry-leading Dell EMC data protection products Avamar and Data Domain. Avamar facilitates fast and efficient backup and recovery for a Horizon environment. Variable-length data deduplication, a key feature of Avamar data protection software, reduces network traffic significantly and provides better storage efficiency. Data Domain provides backup as well as archival capabilities. Data Domain’s tight integration with Avamar delivers added performance and scalability advantages for large Horizon 7 environments. Let’s look at some of the key points discussed in the operations guide for backup and recovery of the Horizon 7 desktop, management, and user data layers.
In the management layer, the Horizon 7 configuration details are stored in a View LDAP repository as part of the Connection Server configuration. To schedule backups of this repository, select the Connection Server instance in the Horizon console to generate a configuration backup file in a file share. You can then use Avamar VE to back up and restore this configuration backup file. If you are using linked clones, you also need to back up the Composer database.
As discussed earlier in this blog, the backup requirements of the desktop layer depend on the desktop pools and provisioning method. In the case of Horizon instant clones, only the master image (golden image) of each desktop pool needs to be backed up. We recommend taking a clone of the original master image (containing snapshots) and using that cloned image for the backup cycles.
The user data layer contains user-profile shares and other user-related files that are backed up by Avamar software. This layer needs to be protected using a standard data protection approach that is appropriate for user data in any environment.
For a more detailed description of the process to protect each of the layers described above, refer to the operations guide published by the Dell EMC Ready Solutions for VDI team.
The backup and recovery approach for Horizon virtual desktop environments is different from the approach followed for physical desktops and other virtual machines. To develop a successful operational backup strategy for Horizon, you must consider all three component layers (desktop, management, user data). The successful recovery of each of these interdependent components is essential to restore and deliver a fully functional user desktop. To make sure that your backup and recovery plan is effective from a user and business perspective, we recommend that you perform a backup and recovery test for all three layers simultaneously.
In the next part, we will conclude the blog series with some discussion on multi-cloud and hybrid cloud strategies for Horizon 7. So, stay tuned for more!
Thanks for Reading,
Anand Johnson - On Twitter @anandjohns