It’s Time to Expect Flexible Disaster Recovery
Wed, 13 Oct 2021 20:50:54 -0000
Rigid and complex disaster recovery (DR) can be a thing of the past with Dell EMC Integrated System for Microsoft Azure Stack HCI.
When data is currency, DR is non-negotiable
If your organization is like many others—of any size—it relies increasingly on data to thrive. This is particularly true for businesses that are on track to modernize their infrastructure and application architectures. For those organizations, data and the workloads that process it are truly the lifeblood of the business.
When business relies on data to function, recovery-point objectives (RPOs) and recovery-time objectives (RTOs) must be as low as possible. However, legacy disaster recovery (DR) solutions are complex to design and maintain, and they might require manual intervention during a DR scenario. These solutions can also be costly, especially if you must maintain a dedicated DR site. That’s why a flexible and performant DR solution is a crucial part of infrastructure modernization.
Stretch clustering could be the answer
Today, enterprise organizations are consolidating, refreshing, and modernizing their aging virtualization platforms with hyperconverged infrastructure (HCI). HCI architectures help customers achieve a highly automated and orchestrated cloud-operations experience. The architectures are designed to deliver high levels of performance and scalability with software-defined compute, storage and networking. HCI solutions are also designed to simplify the implementation of high availability and DR for workloads running in virtual machines (VMs) and containers.
What if you could stretch a single HCI cluster across two locations as a DR solution? That would simplify and accelerate DR. Such a solution is now within reach using Microsoft Azure Stack HCI, version 20H2 or later. Azure Stack HCI includes built-in stretch clustering capabilities, which use Storage Replica for volume replication. Stretch clustering allows organizations to split a single HCI cluster across two locations, whether they be rooms, buildings, cities or regions. It provides automatic failover of Microsoft Hyper-V VMs if a site failure occurs.
In general, stretch clustering on Azure Stack HCI is an ideal DR solution for scenarios like these:
- Introducing automatic failover with orchestration for recovery of a web-based application’s front-end server tier after a disaster at a hosting location
- Distributing primary and secondary instances of an infrastructure’s core services, such as Microsoft Active Directory, across two physical locations
- Hosting applications with lower write input/output (I/O) performance characteristics
- Running file-system-based services and other business services that can tolerate being hosted on crash-consistent volumes
- Running database workloads such as Microsoft SQL Server, with one caveat: these workloads often cannot sustain the loss of even a single transaction, so application-layer recoverability solutions such as SQL Always On availability groups might be more appropriate
Putting the solution to the test
Dell Technologies engineers conducted proof-of-concept (PoC) tests to show how Dell EMC Integrated System for Azure Stack HCI with stretch clustering can handle VM and volume placement. We also wanted to observe the impact on a real running application (Dell EMC OpenManage Enterprise) during failover scenarios. Each of the four nodes (two per site) in our testing environment included two Intel® Xeon® Gold 6230R processors and 384 GB of memory, running Azure Stack HCI, version 20H2.
We tested the following scenarios and observed the outcomes listed. For full details, read the white paper, Adding Flexibility to DR Plans with Stretch Clustering for Azure Stack HCI.
- Unplanned cluster-node failure: All VMs fully restarted on the second node at the same site in about 5 minutes.
- Unplanned site failure: Affected VMs moved and came fully back online in 15–20 minutes.
- Planned site failover: The OpenManage Enterprise application was reachable from the client device within 3 minutes of the live migration to site 2.
- Lifecycle management: Applying the BIOS, firmware and driver updates to the stretched cluster took approximately three hours, and the process had no impact on the Dell EMC OpenManage Enterprise (OME) application.
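To relate these observations to the RPO/RTO discussion earlier, here is a minimal Python sketch that compares the recovery times listed above against an RTO target. The 30-minute target and the names `OBSERVED_RECOVERY_MINUTES`, `meets_rto`, and `rto_report` are illustrative assumptions, not values or tooling from the white paper. The 15–20 minute site-failure range is represented by its upper bound.

```python
# Recovery times (minutes) taken from the PoC results listed above.
OBSERVED_RECOVERY_MINUTES = {
    "unplanned cluster-node failure": 5,
    "unplanned site failure": 20,  # upper bound of the 15-20 minute range
    "planned site failover": 3,
}


def meets_rto(observed_minutes: float, rto_minutes: float) -> bool:
    """Return True when the observed recovery time is within the RTO target."""
    return observed_minutes <= rto_minutes


def rto_report(rto_minutes: float) -> dict[str, bool]:
    """Map each tested scenario to whether it met the given RTO target."""
    return {
        scenario: meets_rto(minutes, rto_minutes)
        for scenario, minutes in OBSERVED_RECOVERY_MINUTES.items()
    }


if __name__ == "__main__":
    # Hypothetical 30-minute RTO; substitute your organization's own target.
    for scenario, ok in rto_report(rto_minutes=30).items():
        print(f"{scenario}: {'within' if ok else 'exceeds'} RTO")
```

A tighter target, such as a hypothetical 10 minutes, would flag the unplanned site failure scenario. That is the kind of gap to check before relying on any DR design.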
An accelerated path to simple DR
Dell Technologies offers a broad portfolio of solution configurations designed to meet the requirements of any workload. The solution for DR built on Dell EMC Integrated System for Azure Stack HCI features intelligently designed configurations of AX nodes from Dell Technologies. Dell engineers validate every component of these configurations, including firmware and driver versions. Additionally, Dell ProSupport technicians know the entire solution, from hardware to operating system to Microsoft Storage Spaces Direct to networking. They can help keep the cluster operating at peak performance and availability.
To see the full details of our tests and to learn more about the stretch clustering capability in Azure Stack HCI, read the white paper, Adding Flexibility to DR Plans with Stretch Clustering for Azure Stack HCI.
Related Blog Posts
Virtualize Demanding Applications with a Dell EMC Integrated System for Microsoft Azure Stack HCI
Wed, 13 Oct 2021 20:50:59 -0000
Break through performance barriers
If your organization is on the road to infrastructure modernization, chances are good that your underlying legacy virtualization clusters are being stretched to their limits. This could mean suboptimal performance and resiliency, which can make it difficult to scale clusters and meet service-level agreements (SLAs).
In addition, overtaxed and aging clusters can prevent you from virtualizing applications with demanding performance requirements, which can mean a larger data center footprint and higher corresponding power and cooling costs.
If you’re thinking about refreshing and modernizing your legacy virtualization environments, you might want to consider a Dell EMC Integrated System for Microsoft Azure Stack HCI.
This all-in-one validated hyperconverged infrastructure (HCI) solution includes full-stack lifecycle management, native integration into Microsoft Azure, flexible consumption models and solution-level enterprise support and services expertise. Dell EMC Integrated System for Azure Stack HCI is available in a broad range of configurations, and it includes engineering-validated AX nodes and networking topologies with Dell EMC PowerSwitch network switches. This design and validation can help ensure that every component, including firmware and driver versions, is optimized for demanding workloads.
Dell Technologies performed synthetic workload testing on one of these systems to see how it performed with highly demanding real-world application profiles. The cluster included four AX-7525 nodes, each populated with two 64-core AMD EPYC™ 7742 processors, 24 NVM Express (NVMe) drives (PCIe Gen4) and 100 gigabit Ethernet (GbE) remote direct memory access (RDMA) networking. Dell Technologies tested workloads under these conditions:
- Healthy cluster running 64 virtual machines (VMs) per node
- Healthy cluster running 32 VMs per node
- Degraded cluster with one node failed
- Degraded cluster with two nodes failed
The configuration delivered outstanding results in all tested scenarios, even when the cluster was in a degraded condition. This means that end users will not notice reduced response times, even if it takes IT longer to return the cluster to its fully operational state. You’ll find all the testing details and results in this white paper.
Reasons to believe
When you modernize your virtualization clusters by deploying Dell EMC Integrated System for Azure Stack HCI, you can:
- Virtualize demanding applications that historically needed to remain on physical servers because of their performance requirements.
- Take advantage of a performant, modernized infrastructure that can support the most demanding business services.
- Save precious real estate in the data center by minimizing the cluster size required to deliver performance SLAs.
- Accelerate online transaction processing (OLTP) workloads and improve end-user response times for database applications.
- Achieve fast times to insight with exceptional online analytical processing (OLAP) performance.
- Deliver high throughput at low latency, which means outstanding performance for applications like Microsoft SQL Server.
- Monitor and manage many aspects of the cluster with Dell EMC OpenManage Integration with Microsoft Windows Admin Center. This tool includes one-click, full-stack lifecycle management with Cluster-Aware Updating, dynamic CPU core management, automated cluster creation and cluster expansion.
To see our full test environment details and results and to learn more about Dell EMC Integrated System for Azure Stack HCI, download the white paper, Crash Through Workload Performance Boundaries with Azure Stack HCI.
Experts Recommend Automation for a Healthier Lifestyle
Wed, 20 Oct 2021 19:54:36 -0000
Like any good techie, I can get a little obsessed with gadgets that improve my quality of life. Take, for example, my recent discovery of wearable technology that eases the symptoms of motion sickness. For most of my life, I’ve had to take over-the-counter or prescription medicine when boating, flying, and going on road trips. Then, I stumbled across a device that I could wear around my wrist that promised to solve the problem without the side effects. Hesitantly, I bought the device and asked a friend to drive like a maniac around town while I sat in the back seat. It actually worked – no headache, no nausea, and no grogginess from meds! Needless to say, I never leave home without my trusty gizmo to keep motion sickness at bay.
Throughout my career in managing IT infrastructure, stress has affected my quality of life almost as much as motion sickness. There is one responsibility that has always caused more angst than anything else: lifecycle management (LCM). To narrow that down a bit, I’m specifically talking about patching and updating IT systems under my control. I have sometimes been derelict in my duties because of annoying manual steps that distract me from working on the fun, highly visible projects. It’s these manual steps that can cause the dreaded DU/DL (data unavailable or data loss) to rear its ugly head. Can you say insomnia?
Innovative technology to the rescue once again! While creating a demo video last year for our Dell EMC OpenManage Integration with Microsoft Windows Admin Center (OMIMSWAC), I was blown away by how easy we made the BIOS, firmware, and driver updates on clusters. The video did a pretty good job of showing the power of the Cluster-Aware Updating (CAU) feature, but it didn’t go far enough. I needed to quantify its full potential to change an IT professional’s life by pitting OMIMSWAC’s automated CAU approach against a manual, node-based approach. I captured the results of the bake off in Dell EMC HCI Solutions for Microsoft Windows Server: Lifecycle Management Approach Comparison.
For this white paper to really stand the test of time, I knew I needed to be very clever to compare apples-to-apples. First, I referred to HCI Operations Guide—Managing and Monitoring the Solution Infrastructure Life Cycle, which detailed the hardware updating procedures for both the CAU and node-based approaches. Then, I built a 4-node Dell EMC HCI Solutions for Windows Server 2019 cluster, performed both update scenarios, and recorded the task durations. We all know that automation is king, but I didn’t expect the final tally to be quite this good:
- The automated approach reduced the number of steps in the process by 82%.
- The automated approach required 90% less of my focused attention. In other words, I was able to attend to other duties while the updates were installing.
- If I had been in a production environment, the maintenance window approved by the change control board would have been cut in half.
- The automated process left almost no opportunity for human error.
As the charts in the paper show, these numbers only improved as I extrapolated them out to the maximum Windows Server HCI cluster size of 16 nodes.
I thought these results were too good to be true, so I checked my steps about 10 times. In fact, I even debated with my Marketing and Product Management counterparts about sharing these claims with the public! I could hear our customers saying, “Oh, yeah, right! These are just marketecture hero numbers.” But in this case, I collected the hard data myself. I am still confident that these results will stand up to any scrutiny. This is reality – not dreamland!
Just when I thought it couldn’t get any better
So why am I blogging about a project I did last year? Just when I thought the testing results in the white paper couldn’t possibly get any better, Dell EMC Integrated System for Microsoft Azure Stack HCI came along. Azure Stack HCI is Microsoft’s purpose-built operating system delivered as an Azure service. The current release when writing this blog was Azure Stack HCI, version 20H2. Our Solution Brief provides a great overview of our all-in-one validated HCI system, which delivers efficient operations, flexible consumption models, and end-to-end enterprise support and services. But what I’m most excited about are two lifecycle management enhancements – 1-click full stack LCM and Kernel Soft Reboot – that will put an end to the old adage, “If it looks too good to be true, it probably is.”
Let’s invite OS updates to the party
OMIMSWAC was at version 1.1 when I did my testing last year. In that version, the CAU feature focused on the hardware – BIOS, firmware, and drivers. In OMIMSWAC v2.0, we developed an exclusive snap-in to Microsoft’s Failover Cluster Tool Extension to create 1-click full stack LCM. Available only for clusters running Azure Stack HCI, this feature provides a simple workflow in Windows Admin Center that automates not only the hardware updates but also the operating system updates. How do I see this feature lowering my blood pressure?
- Applying the OS and hardware updates can typically require multiple server reboots. With 1-click full stack LCM, reboots are delayed until all updates are installed. A single reboot per node in the cluster results in greater time savings and shorter maintenance windows.
- I won’t have to use multiple tools to patch different aspects of my infrastructure. The more I can consolidate the number of management tools in my environment, the better.
- A simple, guided workflow that tightly integrates the Microsoft extension and OMIMSWAC snap-in ensures that I won’t miss any steps and provides one view to monitor update progress.
- The OMIMSWAC snap-in provides necessary node validation at the beginning of the hardware updates phase of the workflow. These checks verify that my cluster is running validated AX nodes from Dell Technologies and that all the nodes are homogeneous. This gives me peace of mind knowing that my updates will be applied successfully. I can also rest assured that there will be no interruption to the workloads running in my VMs and containers since this feature leverages CAU.
- The hardware updates leverage the Microsoft HCI solution catalog from Dell Technologies. Each BIOS, firmware, and driver in this catalog is validated by our engineering team to optimize the Azure Stack HCI experience.
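The time savings from a single reboot per node can be sketched with simple arithmetic. The Python snippet below is only an illustration: it assumes nodes restart one at a time, that applying OS and hardware updates separately costs one reboot each, and that every reboot takes the same hypothetical 10 minutes. None of these numbers are measurements, and `rolling_reboot_minutes` is a made-up helper name.

```python
def rolling_reboot_minutes(nodes: int, reboots_per_node: int,
                           minutes_per_reboot: float) -> float:
    """Total reboot time when cluster nodes are restarted one at a time."""
    if nodes < 1 or reboots_per_node < 0 or minutes_per_reboot < 0:
        raise ValueError("nodes must be >= 1; other inputs must be non-negative")
    return nodes * reboots_per_node * minutes_per_reboot


if __name__ == "__main__":
    NODES = 4                # hypothetical 4-node cluster
    MINUTES_PER_REBOOT = 10  # hypothetical, not a measured value

    # Assumed baseline: OS and hardware updates applied separately,
    # each triggering its own reboot on every node.
    separate = rolling_reboot_minutes(NODES, 2, MINUTES_PER_REBOOT)
    # Full stack workflow: reboots deferred to a single restart per node.
    combined = rolling_reboot_minutes(NODES, 1, MINUTES_PER_REBOOT)
    print(f"separate: {separate} min, combined: {combined} min")
```

Under these assumptions the reboot portion of the maintenance window is halved, and the absolute saving grows with the number of nodes.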
The following screenshots were taken from the full stack CAU workflow. The first step indicates which OS updates are available for the cluster nodes.
Node validation is performed before moving forward with hardware updates.
If the Windows Admin Center host is connected to the Internet, the online update source approach obtains all the systems management utilities and the engineering validated solution catalog automatically. If operating in an edge or disconnected environment, the solution catalog can be created with Dell EMC Repository Manager and placed on a file server share accessible from the cluster nodes.
The following image shows a generated compliance report. All non-compliant components are selected by default for updating. After this point, all the OS and non-compliant hardware components will be updated together with only a single reboot per node in the cluster and with no impact to running workloads.
Life is too short to wait for server reboots
Speaking of reboots, Kernel Soft Reboot (KSR) is a new feature coming in Azure Stack HCI, version 21H2 that also has the potential to make my white paper claims even more jaw dropping. KSR will give me the ability to perform a “software-only restart” on my servers – sparing me from watching the paint dry during those long physical server reboots. Initially, the types of updates in scope will be OS quality and security hotfixes since these don’t require BIOS/firmware initialization. Dell Technologies is also working on leveraging KSR for the infrastructure updates in a future release of OMIMSWAC.
KSR will be especially beneficial when using Microsoft’s CAU extension in Windows Admin Center. The overall time savings using KSR multiply for clusters because faster restarts mean less resyncing of data after CAU resumes each cluster node. Each node should reboot at Mach speed if there are only Azure Stack HCI OS hotfixes and Dell EMC Integrated System infrastructure updates that do not require the full reboot. I will definitely be hounding my Product Managers and Engineering team to deliver KSR for infrastructure updates in our OMIMSWAC extension ASAP.
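Why the savings multiply can be shown with a toy model. The Python sketch below assumes that the data resync after a node resumes grows linearly with how long that node was down, and that CAU processes nodes one after another. The restart durations, the resync factor, and the function names are all hypothetical placeholders, not measured KSR results.

```python
def node_update_minutes(restart_minutes: float,
                        resync_per_downtime_minute: float) -> float:
    """Time one node contributes: its restart plus the resync that follows.

    Assumes resync time grows linearly with the node's downtime.
    """
    return restart_minutes * (1 + resync_per_downtime_minute)


def cluster_update_minutes(nodes: int, restart_minutes: float,
                           resync_per_downtime_minute: float) -> float:
    """Total time when nodes are restarted and resynced one after another."""
    return nodes * node_update_minutes(restart_minutes,
                                       resync_per_downtime_minute)


if __name__ == "__main__":
    NODES = 4            # hypothetical cluster size
    RESYNC_FACTOR = 0.5  # hypothetical: 0.5 min of resync per min of downtime

    full = cluster_update_minutes(NODES, 10, RESYNC_FACTOR)  # assumed full reboot
    soft = cluster_update_minutes(NODES, 2, RESYNC_FACTOR)   # assumed soft restart
    print(f"full reboot: {full} min, soft restart: {soft} min")
```

In this model a shorter restart shrinks both the restart itself and the resync that follows it, and that combined saving is repeated for every node in the cluster.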
Bake off rematch
I decided to hold off on doing a new bake off until Azure Stack HCI, version 21H2 is released with KSR. I also want to wait until we bring the benefits of KSR to OMIMSWAC for infrastructure updates. The combination of OMIMSWAC 1-click full stack CAU and KSR will continue to make OMIMSWAC unbeatable for seamless lifecycle management. This means better outcomes for our organizations, improved blood pressure and quality of life for IT pros, and more motion-sickness-free adventure vacations. I’m also looking forward to spending more time learning exciting new technologies and less time on routine administrative tasks.
If you’d like to get hands-on with all the different features in OMIMSWAC, check out the Interactive Demo in Dell Technologies Demo Center. Also, check out my other white papers, blogs, and videos in the Dell Technologies Info Hub.