Dell PowerStore Enables Kubernetes Stretched Clusters
Wed, 24 Apr 2024 12:50:42 -0000
Kubernetes (K8s) is one of the hottest platforms for building enterprise applications. Keeping enterprise applications online is a major focus for IT administrators. K8s includes many features to provide high availability (HA) for enterprise applications. Dell PowerStore and its Metro volume feature can make K8s availability even better!
Enhance local availability
K8s applications should be designed to be as ephemeral as possible. However, there are some workloads such as databases that can present a challenge. If these workloads are restarted, it can cause an interruption to applications, impacting service levels.
Deploying K8s on VMware vSphere adds a layer of virtualization that allows a virtual machine, in this case a K8s node, to be migrated live (vMotion) to another host in the cluster. This can keep pods up and running and avoid a restart when host hardware changes are required. However, if those pods have a large storage footprint and multiple storage appliances are involved, storage migrations can be resource and time consuming.
Dell PowerStore Metro Volume provides synchronous data replication between two volumes on two different PowerStore clusters. The volume exists as an identical, active-active copy on both PowerStore clusters, which allows compute-only virtual machine migrations. Compute-only migrations complete much faster and are far more practical in most cases, so more workloads can take advantage of vMotion and availability increases.
PowerStoreOS 3.6 introduces a witness component to the Metro Volume architecture. The functional design of the witness adds more resiliency to Metro Volume deployments and further mitigates the risk of split-brain situations. The witness enables PowerStoreOS 3.6 to make intelligent decisions across a wider variety of infrastructure outage scenarios, including unplanned outages.
K8s stretched or geo clusters
Spreading an application cluster across multiple sites is a common design for increasing availability. The compute part is easy to solve because K8s will restart workloads on the remaining nodes, regardless of location. However, if the workload requires persistent storage, the storage needs to exist in the other site.
PowerStore Metro Volume solves this requirement. Metro Volume support for VMware ESXi synchronizes volumes across PowerStore clusters that are within the supported latency and distance limits. In addition to the enhanced vMotion experience, PowerStore Metro Volume provides active-active storage for VMware VMFS datastores that span two PowerStore clusters. For in-depth information about PowerStore Metro Volume, see the white paper Dell PowerStore: Metro Volume.
Lab testing
We tested Dell PowerStore Metro Volume with a SQL Server workload driven by HammerDB on a stretched K8s cluster running on vSphere with three physical hosts and two PowerStore clusters[1]. The K8s cluster was running Rancher RKE2 1.25.12+rke2r1 with a VMFS datastore on a PowerStore Metro Volume, using the VMware CSI provider for storage access. We performed compute-only vMotion migrations and simulated storage network outages as part of the testing.
During the testing, the synchronized active-active copy of the volume was able to assume the IO workload, maintain IO access, and keep SQL Server and the HammerDB workload online. This prevented client disconnects and reconnects, application error messages, and costly recovery time to synchronize and recover data.
After we successfully completed testing on Rancher, we pivoted to another K8s platform: a VMware Tanzu Kubernetes Cluster deployed on VMware vSphere 8 Update 1. We deployed the SQL Server and HammerDB workload and performed a number of other K8s deployments in parallel. Workload test results were consistent. When we took the PowerStore cluster that was running the workload offline, both compute and storage remained available. The result was that the containerized applications were continuously available: not only during the failover, but during the failback as well.
In our Tanzu environment, Metro Volume went beyond data protection alone. It also provided infrastructure protection for objects throughout the Workload Management hierarchy. For example, the vSphere Tanzu supervisor cluster control plane nodes, pods, Tanzu Kubernetes clusters, image registry, and content library can all be assigned a VM storage policy and a corresponding storage class which is backed by PowerStore Metro Volumes. Likewise, NSX Manager and NSX Edge networking components on Metro Volume can also take advantage of this deployment model by remaining highly available during an unplanned outage.
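As a hedged sketch of that pattern, a storage class backed by a Metro Volume datastore would reference a VM storage policy through the vSphere CSI provisioner. The class and policy names here are hypothetical placeholders; the policy itself would be defined in vCenter and mapped to the Metro Volume-backed datastores:

```yaml
# Sketch only: assumes a VM storage policy named "Metro Volume Policy" exists in
# vCenter and is associated with VMFS datastores on PowerStore Metro Volumes.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: metro-volume-sc          # hypothetical name
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "Metro Volume Policy"
```

Any Workload Management object assigned a storage class like this one inherits the active-active protection of the underlying Metro Volume.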
Figure 1. Metro Volume with a witness adds portability and resiliency to Tanzu deployments
For more information about PowerStore Metro Volume, increasing availability on SQL Server, and other new features and capabilities, be sure to check out all the latest information on the Dell PowerStore Info Hub page.
Authors:
Doug Bernhardt, Sr. Principal Engineering Technologist, LinkedIn
Jason Boche, Sr. Principal Engineering Technologist, LinkedIn
[1] Based on Dell internal testing, conducted in September 2023.
Related Blog Posts
PowerStore validation with Microsoft Azure Arc-enabled data services updated to 1.25.0
Mon, 12 Feb 2024 20:04:34 -0000
Microsoft Azure Arc-enabled data services allow you to run Azure data services on-premises, at the edge, or in the cloud. Arc-enabled data services align with Dell Technologies’ vision, by allowing you to run traditional SQL Server workloads on Kubernetes, on your infrastructure of choice. For details about a solution offering that combines PowerStore and Microsoft Azure Arc-enabled data services, see the white paper Dell PowerStore with Azure Arc-enabled Data Services.
Dell Technologies works closely with partners such as Microsoft to ensure the best possible customer experience. We are happy to announce that Dell PowerStore has been revalidated with the latest version of Azure Arc-enabled data services, 1.25.0.
Deploy with confidence
One of the deployment requirements for Azure Arc-enabled data services is that you must deploy on one of the validated solutions. At Dell Technologies, we understand that customers want to deploy solutions that have been fully vetted and tested. Key partners such as Microsoft understand this too, which is why they have created a validation program to ensure that the complete solution will work as intended.
By working through this process with Microsoft, Dell Technologies can confidently say that we have deployed and tested a full end-to-end solution and validated that it passes all tests.
The validation process
Microsoft has published tests for their continuous integration/continuous delivery (CI/CD) pipeline that partners and customers can run. For Microsoft to support an Arc-enabled data services solution, it must pass these tests. At a high level, these tests perform the following:
- Connect to an Azure subscription provided by Microsoft.
- Deploy the components for Arc-enabled data services, including SQL Managed Instance, using both direct and indirect connect modes.
- Validate Kubernetes (K8s), hosts, storage, container storage interface (CSI), and networking.
- Run Sonobuoy tests ranging from simple smoke tests to complex high-availability scenarios and chaos tests.
- Upload results to Microsoft for analysis.
When Microsoft accepts the results, they add the new or updated solution to their list of validated solutions. At that point, the solution is officially supported. This process is repeated as needed as new component versions are introduced. Complete details about the validation testing and links to the GitHub repositories are available here.
More to come
Stay tuned for more additions and updates from Dell Technologies to the list of validated solutions for Azure Arc-enabled data services. Dell Technologies is leading the way on hybrid solutions, proven by our work with partners such as Microsoft on these validation efforts. Reach out to your Dell Technologies representative for more information about these solutions and validations.
Author: Doug Bernhardt
Sr. Principal Engineering Technologist
Hybrid Kubernetes Clusters with PowerStore CSI
Fri, 26 Apr 2024 17:47:47 -0000
In today’s world and in the context of Kubernetes (K8s), hybrid can mean many things. For this blog I am going to use hybrid to mean running both physical and virtual nodes in a K8s cluster. Often, when we think of a K8s cluster of multiple hosts, there is an assumption that they should be the same type and size. While that simplifies the architecture, it may not always be practical or feasible. Let’s look at an example of using both physical and virtual hosts in a K8s cluster.
Necessity is the mother of invention
When you need to get things done, you often find a way. This happened on a recent project at Dell Technologies where I needed to perform storage testing with Dell PowerStore on K8s, but I didn't have enough physical servers in my environment for both the control plane and the workload. I wanted to run my performance workload on physical servers, and because the control plane's load would be light, I opted to run the control plane nodes on virtual machines (VMs). The twist is that I also wanted additional worker nodes beyond what my physical servers could provide. The goal was to run my performance workload on physical servers and let everything else run on VMs.
Dell PowerStore CSI to the rescue!
The performance workload on my physical hosts was also using Fibre Channel storage, which adds a twist for workloads running on virtual machines if the storage is presented uniformly to all the hosts. However, using the features of the Dell PowerStore CSI driver and Kubernetes, I don't need to do that. I can simply present Dell PowerStore storage over Fibre Channel to my physical hosts and run my workload there.
The following is a diagram of my infrastructure and key components. There is one physical server running VMware ESXi that hosts several VMs used for K8s nodes, and then three other physical servers that run as physical nodes in the cluster.
What kind of mess is this?!?
As the reader, you're probably thinking: what kind of hodge-podge maintenance nightmare is this? K8s nodes that aren't all the same, plus some hacked-up solution to make it work?!? Well, it's not a mess at all. Allow me to explain how it's quite simple and elegant.
For those new to K8s, implementing something like this probably seems complicated and hard to manage. After all, the workload should only run on the physical K8s nodes that are connected through Fibre Channel. Outside of K8s, the Dell CSI driver, and the features they provide, it likely would be a mess of scripting and dependency checking.
An elegant solution!
In this solution, I combined the labels and scheduling features of K8s with the node-selection feature of the PowerStore CSI driver. The implementation is clean and easy to maintain, with no complicated scripts or configuration.
Step 1 – PowerStore CSI Driver configuration
As part of the PowerStore CSI driver configuration, one of the supported features (node selection) lets you choose the nodes on which the K8s pods (in this case, the CSI driver) will run, by using K8s labels. In the following figure, in the driver configuration, I specify that the PowerStore CSI driver should only run on nodes that carry the label "fc=true". The label can be any key-value pair; the key point is that the value on the node must match the value specified in the driver configuration.
The following is an excerpt from the Dell PowerStore CSI configuration file showing how this is done.
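As a hedged sketch (exact key names may vary by csi-powerstore chart version), the relevant section of the Helm values file looks roughly like this:

```yaml
# Sketch of the node-selection setting in the csi-powerstore Helm values file.
# Key names are approximate and may differ between driver versions.
node:
  # Schedule the CSI driver's node pods only on nodes carrying this label.
  nodeSelector:
    fc: "true"
```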
This is a one-time configuration setting that is done during Dell CSI driver deployment.
Step 2 – Label the physical nodes
The next step is to apply the label "fc=true" to the Fibre Channel-connected nodes on which we want the driver to run. It's as simple as running the command "kubectl label nodes <your-node-name> fc=true". When this label is set, the CSI driver pods will only run on K8s nodes that carry this label and value.
The label only needs to be applied when adding new nodes to the cluster, or removed if you change a node's role and take it out of this workload.
Step 3 – Let Kubernetes do its magic
Now, I leverage basic K8s functionality. Kubernetes resource scheduling evaluates the resource requirements for a pod and will only schedule on the nodes that meet those requirements. Storage volumes provided by the Dell PowerStore CSI driver are a dependency for my workload pods, and therefore, my workload will only be scheduled on K8s nodes that can meet this dependency. Because I’ve enabled the node selection constraint for the CSI driver only on physical nodes, they are the only nodes that can fill the PowerStore CSI storage dependency.
The result of this configuration is that the three physical nodes that I labeled are the only ones that will accept my performance workload. It’s a very simple solution that requires no complex scripting or configuration.
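To make the dependency concrete, here is a hedged sketch of a workload manifest. The storage class name "powerstore-fc" and the container image are hypothetical placeholders; the actual storage class would come from the PowerStore CSI driver deployment:

```yaml
# Sketch only: "powerstore-fc" is a hypothetical storage class assumed to be
# provisioned by the PowerStore CSI driver.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sql-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: powerstore-fc
  resources:
    requests:
      storage: 100Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: sql-workload
spec:
  containers:
    - name: sql
      image: mcr.microsoft.com/mssql/server:2022-latest   # example workload image
      volumeMounts:
        - name: data
          mountPath: /var/opt/mssql
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: sql-data
```

Because the claim can only be satisfied where the CSI driver runs, the pod lands on the labeled physical nodes without any node affinity in the workload spec itself.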
Here is that same architecture diagram showing the nodes that were labeled for the workload.
Kubernetes brings lots of exciting new capabilities that can provide elegant solutions to complex challenges. Our latest collaboration with Microsoft used this architecture. For complete details, see our joint white paper Dell PowerStore with Azure Arc-enabled Data Services, which highlights performance and scale.
Also, for more information about Arc-enabled SQL Managed Instance and PowerStore, see:
- the Microsoft blog post: Performance benchmark of Azure Arc-enabled SQL Managed Instance
- the Microsoft digital events Microsoft Build and Azure Hybrid, Multicloud, and Edge Day
Author: Doug Bernhardt
Sr. Principal Engineering Technologist