On-demand disaster recovery
Many businesses with lean IT teams are simply unable to perform all the work required to maintain a DR site, from patching and updating to failover testing. DRaaS takes the burden of maintaining DR sites from the organization and puts it into the hands of experts in disaster recovery. It can also be much more affordable than hosting your own DR infrastructure in another region, with servers running in air-conditioned rooms that no one ever visits, waiting for a disaster to strike. If a disaster doesn’t happen, that expensive second infrastructure, a capex investment, just ages. Even if that second datacenter is also used for purposes other than DR capacity, keeping it compatible with the main datacenter can be very time-consuming.
VMware Cloud Disaster Recovery offers on-demand disaster recovery not only to large IT organizations, but also to smaller organizations that may lack resources (staff and equipment). It is intended for organizations that need service resiliency but find traditional DR complex, expensive, and unreliable. VMware Cloud Disaster Recovery also helps security and compliance teams ensure that operations can resume after a disaster event. Delivered as an easy-to-use software-as-a-service (SaaS) solution with cloud economics, VMware Cloud Disaster Recovery combines cost-efficient cloud storage with simple SaaS-based management for IT resiliency at scale. Customers benefit from consistent, familiar VMware operations across production and DR sites, a capacity model in which failover resources are paid for only when they are needed, and instant power-on capabilities for fast recovery after disaster events.
VMware Cloud Disaster Recovery consists of four components: a DRaaS Connector, a SaaS Orchestrator, a Scale-out Cloud File System, and an On-Demand Failover Target. The following paragraphs describe how each of these is used to deliver a holistic DRaaS solution.
The DRaaS Connector is deployed in VxRail and other vSphere environments. It provides the data protection service connection to the SaaS Orchestrator and the Scale-out Cloud File System. The connector is also responsible for triggering the replication of virtual machines (VMs) in the environment according to the defined protection schedule. All local activities are carried out through the DRaaS Connector vApp.
VMs can be included in a protection group by using one of three methods:
This enables new VMs to be protected automatically after they are created, making VM protection straightforward.
Replications initiated by the DRaaS Connector use changed block tracking (CBT) technology. With CBT, only the parts of a VM that have been created or modified since the previous replication are transmitted, instead of the entire VM. This reduces the overall amount of data transmitted to the Scale-out Cloud File System. The exception is the first time a VM is protected, when the entire VM must be transmitted.
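The idea behind incremental replication can be illustrated with a minimal Python sketch. It is not the DRaaS Connector's implementation: real CBT records changed blocks as writes occur, whereas this sketch compares two in-memory disk images after the fact. The function names and the tiny block size are hypothetical, chosen only for illustration, and the sketch does not handle a disk that shrinks.

```python
from typing import Dict, List, Optional

BLOCK_SIZE = 4  # bytes; unrealistically small, for illustration only


def split_blocks(disk: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Split a disk image into fixed-size blocks (the last one may be short)."""
    return [disk[i:i + block_size] for i in range(0, len(disk), block_size)]


def changed_blocks(previous: Optional[bytes], current: bytes,
                   block_size: int = BLOCK_SIZE) -> Dict[int, bytes]:
    """Return {block_index: data} for the blocks that must be transmitted.

    With no previous copy (first protection of a VM), every block is sent.
    Otherwise only new or modified blocks are sent.
    """
    cur = split_blocks(current, block_size)
    if previous is None:
        return dict(enumerate(cur))
    prev = split_blocks(previous, block_size)
    return {i: blk for i, blk in enumerate(cur)
            if i >= len(prev) or prev[i] != blk}


def apply_blocks(replica: bytes, delta: Dict[int, bytes],
                 block_size: int = BLOCK_SIZE) -> bytes:
    """Apply transmitted blocks to the replica held at the target."""
    blocks = split_blocks(replica, block_size)
    for index, data in sorted(delta.items()):
        if index < len(blocks):
            blocks[index] = data   # overwrite a modified block
        else:
            blocks.append(data)    # append a newly created block
    return b"".join(blocks)


if __name__ == "__main__":
    v1 = b"AAAABBBBCCCC"
    v2 = b"AAAAXXXXCCCCDD"
    full = changed_blocks(None, v1)    # initial replication: all blocks
    delta = changed_blocks(v1, v2)     # later replication: changes only
    print(len(full), sorted(delta))
    assert apply_blocks(v1, delta) == v2
```

In this example the second replication transmits two blocks (one modified, one new) rather than all four, which is the saving the paragraph above describes.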
The DRaaS Connector is managed by the SaaS Orchestrator, a cloud-based service that controls scheduling, recovery, and other DR operations. The SaaS Orchestrator simplifies DR maintenance operations, eliminating the customer burden of lifecycle management for DR software. It can scale up to 1,500 VMs across multiple SDDC clusters. Compliance checks occur every 30 minutes, increasing confidence that the DR plan will work when needed. DR plans can be run either as a failover or as a "Test Failover," which performs all the plan’s recovery operations in a test site for validation. Finally, VMware Cloud Disaster Recovery automatically generates detailed reports for events such as tests and failovers, to comply with internal organizational policies and regulatory compliance requirements.
Compliance checks run automatically to verify the integrity of all constraints and definitions listed in the DR plan. For example, if the DR plan relies on replication of protection groups between different sites, compliance checks continuously monitor replication health and alert the administrator if replication stops, for instance because firewall rules were applied in error. The alert is raised when the firewall change takes effect, which provides a chance to address the problem immediately instead of waiting for the next DR testing cycle to expose it.
Compliance checks also monitor many other components at protected and recovery sites. They perform integrity checks on the DR plan to make sure that referenced objects such as VMs, datastores, or virtual networks continue to exist and remain healthy. Below is an example of a continuous compliance report for a healthy plan.
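As a rough illustration of the kind of check described above (not the product's report format or implementation), the following Python sketch verifies that every object a DR plan references still exists in a site inventory and that replication is healthy. All class, field, and function names here are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class DrPlan:
    """Hypothetical DR plan: the objects it depends on, by name."""
    name: str
    vms: Set[str]
    datastores: Set[str]
    networks: Set[str]


@dataclass
class SiteInventory:
    """Hypothetical snapshot of what currently exists at a site."""
    vms: Set[str]
    datastores: Set[str]
    networks: Set[str]
    replication_healthy: bool = True


def check_compliance(plan: DrPlan, inventory: SiteInventory) -> List[str]:
    """Return a list of findings; an empty list means the plan is healthy."""
    findings: List[str] = []
    for kind in ("vms", "datastores", "networks"):
        # Set difference: objects the plan references that no longer exist.
        missing = sorted(getattr(plan, kind) - getattr(inventory, kind))
        for name in missing:
            findings.append(
                f"{kind}: '{name}' referenced by plan '{plan.name}' not found")
    if not inventory.replication_healthy:
        findings.append("replication: protection group replication is not healthy")
    return findings


if __name__ == "__main__":
    plan = DrPlan("tier-1", {"web01", "db01"}, {"ds-prod"}, {"net-app"})
    site = SiteInventory({"web01", "db01"}, {"ds-prod"}, set(),
                         replication_healthy=False)
    for finding in check_compliance(plan, site):
        print(finding)
```

Running a check like this on a fixed interval, rather than only during a DR test, is what lets a problem such as a deleted network or stalled replication surface soon after it is introduced.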
The DRaaS Connector connects to the Scale-out Cloud File System. The file system runs on cloud object storage and stores the replicated copies. These copies are immutable and encrypted for security. VMs are kept in the native vSphere format, making recovery quicker. You do not have to copy the replicas from the Scale-out Cloud File System to the VMware Cloud primary storage system before you power on the VMs. You can power on the VMs as soon as the SDDC capacity is available and settings are configured. This capability is known as Live Mount technology: a VM can be restarted without first being copied to SDDC primary storage. Live Mount also makes testing much simpler to carry out, because the VM can start immediately on the On-Demand Failover Target.
The On-Demand Failover Target is a VMware Cloud on AWS instance. That means VMs running in a local software-defined data center (SDDC), such as those built on VxRail with VMware Cloud Foundation, can easily be recovered in the event of a site failure, ransomware attack, or other DR scenario. Both VxRail and VMware Cloud on AWS use the same virtualization technology, and VxRail is easy to keep updated, so the primary site can readily be kept at a version that matches the On-Demand Failover Target. Compatibility issues between versions are therefore substantially reduced, allowing IT staff to focus on recovery instead of versioning during a disaster.
A faster recovery scenario is possible where SDDC clusters have reserved resources available to begin a DR at a moment’s notice. This is available with the Pilot Light option of VMware Cloud Disaster Recovery. Pilot Light reserves resources in the On-Demand Failover Target and allows for configuration of critical services, like network resources and other service mappings, so they are already running. This reduces the recovery steps that must be completed before applications can be brought back online.
These factors create the basis for a simple and straightforward recovery. Because of the elasticity of cloud computing, compute resources are charged only while they are in use during a DR failover or DR test, which reduces the cost of DR. Leveraging the cloud for DR eliminates the need for idle equipment that is being paid for but not used.