Thu, 09 Nov 2023 16:07:41 -0000
UPDATE 11/7/2023: This blog and the embedded YouTube videos were published after Dell APEX Cloud Platform for Microsoft Azure was first announced at Dell Technologies World 2023 and contain early preview content. Please proceed to the following link to see the most up-to-date collateral and YouTube demo videos created after the platform became generally available in September 2023. https://www.youtube.com/playlist?list=PL2nlzNk2-VMEkNM7E8m0ia_lLHWlOuT5h
It was another exhilarating Dell Technologies World (DTW) back in May. It’s always fun connecting with colleagues, customers, and partners in Las Vegas. As always, Vegas managed to surprise me with something I’d never seen before. I finally witnessed the incredible iLuminate team up close and personal at the APEX After Dark party. I tried to describe the phenomenon to a friend who hasn’t experienced one of their performances, but words cannot adequately convey this mesmerizing spectacle of sight and sound! In the end, only one of my photos from the event and a link to one of their recorded shows could make it real for them.
Similarly, words alone can’t do justice to the game changing potential of the new APEX Cloud Platform announced at DTW. That’s why I created a demo video giving customers an early preview[1] of the new management and orchestration capabilities coming to our APEX Cloud Platform Foundation Software. This software integrates intelligent automation into the familiar management tools of each supported cloud ecosystem – Microsoft Azure, Red Hat OpenShift, and VMware vSphere.
In this blog, I want to showcase APEX Cloud Platform for Microsoft Azure and the features and functionality we integrate into Microsoft Windows Admin Center. My colleague and friend, Kenny Lowe, wrote a brilliant analysis of our new solution in his recent blog post, Delving Into the APEX Cloud Platform for Microsoft Azure. He included some screen shots from my demo video, which hasn’t been shared publicly until now. I highly recommend reading his enlightening article, which provides invaluable context before viewing the demos.
Please be aware that the clips below are sections of a lengthier video that shares the story of a fictional retail company named WhyGoBuy. They used APEX Cloud Platform Foundation Software to accelerate their time to value and improve operational efficiency. Because this video was over 15 minutes long, I divided it into bite-sized chunks and included a brief introduction to each administrative task. You can view the full video HERE.
Without further ado, let’s dive into the technology!
At initial release of APEX Cloud Platform for Microsoft Azure, Dell Technologies is offering a white-glove deployment experience through Dell ProDeploy Services. Our expert technicians will walk you through your first deployments to help you get comfortable with the process. Soon after announcing general availability, we will empower you to install the platform yourself using the APEX Cloud Platform Foundation Software deployment automation. In this first video, our administrators at WhyGoBuy followed the step-by-step user input configuration method and provided the settings in each step of the deployment wizard.
The next video presents a common Day 2 operations scenario. Some of WhyGoBuy’s Storage Spaces Direct volumes were approaching maximum capacity, and one volume required immediate attention. Luckily, APEX Cloud Platform for Microsoft Azure offered a consistent hybrid management experience. Administrators were promptly made aware of the issue through Azure Monitor, which provided observability for their entire fleet of platforms across data center and edge locations. Then, they navigated to the Windows Admin Center extension for further investigation and remediation of the issue.
Lifecycle management is critical to ensuring the optimal security, performance, and reliability of any infrastructure. With APEX Cloud Platform Foundation Software, Dell helps our customers remain in a continuously validated state – updating the platform from one known good state to the next, inclusive of hardware, operating system, and systems management software. A few months had passed since WhyGoBuy deployed their first platform, and the time had come to apply a quarterly baseline bundle using the Windows Admin Center extension. The following video captures their experience.
WhyGoBuy was committed to maintaining a robust security posture. They used the APEX Cloud Platform Foundation Software's intrinsic infrastructure security management features to help them accomplish this. The next video showcases two of these features:
In this final video, WhyGoBuy set up connectivity to Dell ProSupport to benefit from log collection, phone home, automated case creation, and remote support. They also wanted to send telemetry data to Dell CloudIQ cloud-based software for multi-cluster monitoring. CloudIQ provided proactive monitoring, machine learning, and predictive analytics so they could take quick action and simplify operations of all their on-premises APEX Cloud Platforms.
We are excited to bring Dell APEX Cloud Platform for Microsoft Azure to market later this year. I’ve compiled the following list of available resources for further learning.
After we launch this solution, you’ll be able to find white papers, videos, blogs, and more at the APEX tile at our Info Hub site.
And as always, please reach out to your Dell account team if you would like to have more in-depth discussions about the APEX portfolio. If you don’t currently have a Dell contact, we’re here to help on our corporate website.
Author: Michael Lamia, Engineering Technologist at Dell Technologies
Follow me on Twitter: @Evolving_Techie
LinkedIn: https://www.linkedin.com/in/michaellamia/
Email: michael.lamia@dell.com
[1] Dell APEX Cloud Platform for Microsoft Azure will be generally available later in 2023. Some of the features and functionality depicted in these videos may behave differently at initial release or may not be available until later releases. Dell makes no representation and undertakes no obligations with regard to product planning information, anticipated product characteristics, performance specifications, or anticipated release dates (collectively, “Roadmap Information”). Roadmap Information is provided by Dell as an accommodation to the recipient solely for the purposes of discussion and without intending to be bound thereby.
Tue, 26 Sep 2023 14:22:06 -0000
Today, Dell Technologies announced the general availability of Dell APEX Cloud Platform for Microsoft Azure. This on-premises, turnkey infrastructure platform is collaboratively engineered with Microsoft to optimize the Azure hybrid cloud experience.
It is the first offer in Premier Solutions for Microsoft Azure Stack HCI, a new category in the Azure Stack HCI catalog reserved for key partners with the greatest levels of engagement with Microsoft and deepest integrations into familiar Microsoft management tools.
Dell APEX Cloud Platform for Microsoft Azure comes bundled with fully automated management and orchestration, delivered by Dell APEX Cloud Platform Foundation Software. This software runs in a virtual appliance on each cluster and functions as the brains of the solution stack. The Cloud Platform Manager VM communicates with the underlying infrastructure and injects automation workflows into Microsoft Windows Admin Center via the Dell APEX Cloud Platform extension, as depicted in the following diagram.
Features that deliver breakthrough operational efficiency from Day 1 through Day 2/N include:
In this blog, we will focus on one of the most compelling and highly anticipated features of Dell APEX Cloud Platform Foundation Software – next generation full stack lifecycle management (LCM).
Our latest approach to LCM keeps Dell APEX Cloud Platform for Microsoft Azure operating in a Continuously Validated State (CVS) – advancing from one Known Good State (KGS) to the next, inclusive of hardware, operating system, and systems management software. This approach dramatically accelerates time to value, making new Microsoft updates available within just four hours of their release.
The following graphic depicts the journey of an update from development to installation.
Dell Technologies is no stranger to efficiently applying updates to Azure Stack HCI clusters, having done so using a fully automated, cluster-aware approach with no impact to running workloads since 2019.
We first introduced this automation in our Dell OpenManage Integration with Microsoft Windows Admin Center v1.1. At that time, we provided the ability to generate a compliance report within our standalone extension that compared the currently running BIOS, firmware, and driver versions with an engineering-validated solution baseline catalog. Administrators could choose between targeting an online catalog or an offline catalog created with Dell Repository Manager, and the standalone extension would then orchestrate the updates using Cluster-Aware Updating.
Version 2.0 of our OpenManage Integration extension went a step further to deliver our first foray into full stack cluster-aware updating through a snap-in developed for Microsoft’s Updates extension.
Using this snap-in, Azure Stack HCI operating system updates and Dell hardware updates (i.e., BIOS, firmware, and drivers) were applied using a single, consolidated workflow. This workflow only required one reboot per cluster node and was completely non-disruptive to running workloads. Once again, IT administrators could view a compliance report and select an online or DRM-created offline catalog for the Dell updates.
We’ve developed an entirely new Windows Admin Center extension with integrated Dell APEX Cloud Platform Foundation Software workflow automation. We continue to build on the pedigree we’ve established over the last four years with our OpenManage Integration extension, and we now also incorporate proven, market-leading intellectual property (IP) from our other hyperconverged infrastructure (HCI) and software-defined storage (SDS) offerings. Some of this innovative IP is derived from our highly successful VxRail HCI System Software and sets a new standard for lifecycle management in a turnkey infrastructure platform.
When freshly deployed, Dell APEX Cloud Platform for Microsoft Azure runs at peak performance and resiliency to support your current workloads. The platform also comes secure by default with the following protection:
This pristine operating environment is known as the platform’s current Known Good State (KGS). Rest assured that the entire platform is running in a condition that is collaboratively validated by Dell and Microsoft engineering. To maintain the robust default security posture and optimal performance and resiliency, you need to keep the platform in a Continuously Validated State (CVS). Comprehensively advancing the end-to-end platform from one KGS to the next is accomplished with zero interruption to running workloads. The following graphic shows an example of a quarterly update that includes multiple software and hardware update components.
(Note: The release versions in this graphic are examples only and do not align with any official Dell APEX Cloud Platform for Microsoft Azure planned releases.)
The following table summarizes the different platform components that must be routinely updated to be compliant with the current or target KGS.
Component | Provider | Description | Example versioning |
Azure Stack HCI Solution | Microsoft | This contains OS quality and security updates, feature updates, emergency patches, and the Azure Stack HCI supplemental package | 10.2306.1.11 |
Dell APEX Cloud Platform Foundation Software | Dell Technologies | All software and services running inside the Cloud Platform Manager virtual machine | 01.00.00.00 |
Solution Builder Extension (SBE) | Dell Technologies | Package that contains all hardware updates for BIOS, iDRAC, firmware and drivers | 4.0.2307.1 |
The Azure Stack HCI Solution component follows the Modern Lifecycle policy, which defines the products and services that are continuously serviced and supported. To keep your Azure Stack HCI service in a supported state, you have up to six months to install updates. Dell and Microsoft recommend installing all updates as they are released to capitalize on the rapid pace of innovation and inclusion of new features. To learn more, see Azure Stack HCI release information.
Dell and Microsoft release the following types of updates for this platform:
Update type | Description | Typical cadence |
Baseline updates | Baseline updates include new features and improvements. They typically require host system reboots and might take longer. | Quarterly |
Patch Updates | Patch updates primarily contain quality and reliability improvements. They might include OS LCUs or hot patches. Some patches require host system reboots, while others don't. To fix critical or security issues, patches might be released sooner than monthly. | Monthly |
Hotfix | Hotfixes address blocking issues that could prevent regular patch or baseline updates. | On-demand |
Microsoft Azure and Dell update sites are periodically queried to discover applicable updates. These updates are listed in the Updates tab in the Dell APEX Cloud Platform extension in Windows Admin Center.
All updates – even emergency patches from Microsoft that address critical security vulnerabilities or bug fixes – will appear in the Dell extension within just four hours of being released. This near immediate availability of patches is unprecedented in a turnkey, on-premises infrastructure platform. And whether the updates are from Microsoft, Dell, or both organizations, you’ll never need to navigate away from the Dell APEX Cloud Platform extension interface to apply them.
In the past, Dell validated hardware updates and Microsoft validated operating system updates independently. With our enhanced lifecycle management approach, every update discovered by Dell APEX Cloud Platform Foundation Software has been jointly tested and validated by Dell and Microsoft. We incorporate new periodic builds of hardware, OS, and systems management components into our respective validation CI/CD pipelines. This raises the bar to an entirely new level of confidence and peace-of-mind for IT administrators.
In the relentless pursuit of delivering worry-free updates, the full stack lifecycle management workflow performs extensive prechecks before any update operations are initiated. For example, all platform components are checked to ensure they comply with the current KGS. If Dell Infrastructure Lock is enabled on the platform, a dialog box informs you that it will be temporarily disabled to allow updates and re-enabled after the update workflow is complete to maintain a strong security posture.
The entire update process relies heavily on Azure Stack HCI’s Lifecycle Manager feature, which employs Cluster-Aware Updating (CAU) to ensure no workloads are interrupted. One cluster node is placed into maintenance mode at a time, which triggers the Live Migration of VMs. CAU installs the updates, restarts the node if required, returns the node to a fully functional state, and proceeds to the next node in the cluster. When the LCM workflow is complete, a new compliance check is triggered to confirm that the platform has fully transitioned to the new target KGS.
The best way to summarize all the incredible benefits I’ve discussed about our evolved LCM approach is with a demo. Experience for yourself how stress-free LCM can be in this short video vignette.
We have tons of great content to help you deep-dive into Dell APEX Cloud Platform for Microsoft Azure powered by Dell APEX Cloud Platform Foundation Software.
And as always, please reach out to your Dell account team if you would like to have more in-depth discussions about the Dell APEX Cloud Platforms family. If you don’t currently have a Dell contact, we’re here to help on our corporate website.
Author: Michael Lamia, Engineering Technologist at Dell Technologies
Follow me on Twitter: @Evolving_Techie
LinkedIn: https://www.linkedin.com/in/michaellamia/
Email: michael.lamia@dell.com
Mon, 24 Jul 2023 15:06:01 -0000
Welcome to Part 2 of a three-part blog series on running containerized applications on Microsoft Azure’s hybrid ecosystem. Part 1 provided the conceptual foundation and necessary context for the hands-on efforts documented in parts 2 and 3. This article details the testing we performed with AKS engine and Azure Monitor for containers in the Dell Technologies labs.
Here are the links to all the series articles:
Part 1: Running containerized applications on Microsoft Azure’s hybrid ecosystem – Provides an overview of the concepts covered in the blog series.
Part 2: Deploy K8s Cluster into an Azure Stack Hub user subscription – Set up an AKS engine client VM, deploy a cluster using AKS engine, and onboard the cluster to Azure Monitor for containers.
Part 3: Deploy a self-hosted Docker Container Registry – Use one of the Azure Stack Hub QuickStart templates to set up a container registry and push images to this registry. Then, pull these images from the registry into the K8s cluster deployed with AKS engine in Part 2.
Introduction to the test lab
Before proceeding with the results of the testing with AKS engine and Azure Monitor for containers, we’d like to begin by providing a tour of the lab we used for testing all the container-related capabilities on Azure Stack Hub. Please refer to the following two tables. The first table lists the Dell EMC for Microsoft Azure Stack hardware and software. The second table details the various resource groups we created to logically organize the components in the architecture. The resource groups were provisioned into a single user subscription.
Scale Unit Information | Value |
Number of scale unit nodes | 4 |
Total memory capacity | 1 TB |
Total storage capacity | 80 TB |
Logical cores | 224 |
Azure Stack Hub version | 1.1908.8.41 |
Connectivity mode | Connected to Azure |
Identity store | Azure Active Directory |
Resource Group | Purpose |
demoaksengine-rg | Contains the resources associated with the client VM running AKS engine. |
demoK8scluster-rg | Contains the cluster artifacts deployed by AKS engine. |
demoregistryinfra-rg | Contains the storage account and key vault supporting the self-hosted Docker Container Registry. |
demoregistrycompute-rg | Contains the VM running Docker Swarm and the self-hosted container registry containers and the other supporting resources for networking and storage. |
kubecodecamp-rg | Contains the VM and other resources used for building the Sentiment Analysis application that was instantiated on the K8s cluster. |
Please also refer to the following graphic for a high-level overview of the architecture.
Prerequisites
All the step-by-step instructions and sample artifacts used to deploy AKS engine can be found in the Azure Stack Hub User Documentation and GitHub. We do not intend to repeat what has already been written – unless we want to specifically highlight a key concept. Instead, we will share lessons we learned while following along in the documentation.
We decided to test the supported scenario whereby AKS engine deploys all cluster artifacts into a new resource group rather than deploying the cluster to an existing VNET. We also chose to deploy a Linux-based cluster using the kubernetes-azurestack.json API model file found in the AKS engine GitHub repository. A Windows-based K8s cluster cannot be deployed by AKS engine to Azure Stack Hub at the time of this writing (November 2019). Do not attempt to use the kubernetes-windows.json file in the GitHub repository, as this will not be fully functional.
Addressing the prerequisites for AKS engine was very straightforward:
At this point, we built a Linux client VM for running the AKS engine command-line tool used to deploy and manage the K8s cluster. Here are the specifications of the VM provisioned:
Our Azure Stack Hub runs with certificates generated from an internal standalone CA in the lab. This means we needed to import our CA’s root certificate into the client VM’s OS certificate store so it could properly connect to the Azure Stack Hub management endpoints before going any further. We thought we would share the steps to import the certificate:
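On Ubuntu, the procedure boils down to copying the root certificate into the system trust store and refreshing it. Here is a minimal sketch; the certificate file name is ours, so substitute your own CA's root certificate:
# Copy the CA root certificate (.crt/PEM format) into the trusted store and refresh it
sudo cp lab-rootca.crt /usr/local/share/ca-certificates/lab-rootca.crt
sudo update-ca-certificates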
Note: We learned that if this procedure to import a CA’s root certificate is ever carried out on a server already running Docker, you have to stop and re-start Docker at this point. This is done on Ubuntu via the following commands:
sudo systemctl stop docker
sudo systemctl start docker
We then SSH’d into the client VM. While in the home directory, we executed the prescribed command in the documentation to download the get-akse.sh AKS engine installation script.
curl -o get-akse.sh https://raw.githubusercontent.com/Azure/aks-engine/master/scripts/get-akse.sh
chmod 700 get-akse.sh
./get-akse.sh --version v0.43.0
Once installed, we issued the aks-engine version command to verify a successful installation of AKS engine.
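That verification is a single command:
aks-engine version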
Deploy K8s cluster
There was one more step that needed to be taken before issuing the command to deploy the K8s cluster. We needed to customize a cluster specification using an example API Model file. Since we were deploying a Linux-based Kubernetes cluster, we downloaded the kubernetes-azurestack.json file into the home directory of the client VM. Though we could have used nano on the Linux VM to customize the file, we decided to use WinSCP to copy this file over to the management workstation so we could use VS Code to modify it instead. Here are a few notes on this:
While still in the home directory of the client VM, we issued the command to deploy the K8s cluster. Here is the command that was executed. Remember that the client ID and client secret are associated with the SPN, and the subscription ID is that of the Azure Stack Hub user subscription.
aks-engine deploy \
--azure-env AzureStackCloud \
--location Rack04 \
--resource-group demoK8scluster-rg \
--api-model ./kubernetes-azurestack.json \
--output-directory demoK8scluster-rg \
--client-id xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
--client-secret xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
--subscription-id xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
Once the deployment was complete and verified by deploying mysql with Helm, we copied the apimodel.json file found in the home directory of the client VM under a subdirectory with the name of the cluster’s resource group – in this case demoK8scluster-rg – to a safe location outside of the Azure Stack Hub scale unit. This file was used as input in all of the other AKS engine operations. The generated apimodel.json contains the service principal, secret, and SSH public key used in the input API Model. It also has all the other metadata needed by the AKS engine to perform all other operations. If this gets lost, AKS engine won't be able to configure the cluster down the road.
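Going back to the verification step mentioned above, the Helm smoke test looked roughly like the following. This is a hedged sketch using Helm 2 syntax (current at the time of our testing); the release name is ours, and Tiller must be initialized on the cluster first:
helm init
helm install stable/mysql --name mysql-verify
kubectl get pods --watch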
Onboarding the new K8s cluster to Azure Monitor for containers
Before introducing our first microservices-based application to the K8s cluster, we wanted to onboard the cluster to Azure Monitor for containers. Azure Monitor for containers provides a rich monitoring experience not only for AKS in public Azure but also for K8s clusters deployed in Azure Stack using AKS engine. We wanted to see what resources were being used by the Kubernetes system functions alone before deploying any applications. The steps in this section were performed on one of the K8s primary nodes using an SSH connection.
The prerequisites were fairly straightforward, but we did make a couple of observations while stepping through them:
Note: Microsoft also supports the enabling of monitoring on this K8s cluster on Azure Stack Hub deployed with AKS engine using the API definition as an alternative to using the HELM chart. The API definition option doesn’t have a dependency on Tiller or any other component. Using this option, monitoring can be enabled during the cluster creation itself. The only manual step for this option would be to add the Container Insights solution to the Log Analytics workspace specified in the API definition.
We already had a Log Analytics Workspace at our disposal for this testing. We did make one mistake during the onboarding preparations, though. We intended to manually add the Container Insights solution to the workspace but installed the legacy Container Monitoring solution instead of Container Insights. To be on the safe side, we ran the onboarding_azuremonitor_for_containers.ps1 PowerShell script and supplied the values for our resources as parameters. The script skipped the creation of the workspace since we already had one and just installed the Container Insights solution via an ARM template in GitHub identified in the script.
At this point, we could simply issue the HELM commands under the Install the chart section of the documentation. Besides inserting the workspace ID and workspace key, we also replaced the --name parameter value with azuremonitor-containers. We did not observe anything out of the ordinary during the onboarding process. Once complete, we had to wait about 10 minutes before we could go to the Public Azure portal and see our cluster appear. We had to remember to click on the drop-down menu for Environment in the Container Insights section in Azure Monitor and select “Azure Stack (Preview)” for the cluster to appear.
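For reference, those chart installation commands followed this general pattern. This is a hedged sketch recalled from the 2019-era documentation – the repository URL and parameter names may have since changed – with placeholders for the workspace ID, workspace key, and cluster name:
helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com/
helm install --name azuremonitor-containers \
  --set omsagent.secret.wsid=<workspace ID>,omsagent.secret.key=<workspace key>,omsagent.env.clusterName=<cluster name> \
  incubator/azuremonitor-containers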
We hope this blog post proves to be insightful to anyone deploying a K8s cluster on Azure Stack Hub using AKS engine. We also trust that the information provided will assist in the onboarding of that new cluster to Azure Monitor for containers. In Part 3, we will step through our experience deploying a self-hosted Docker Container Registry into Azure Stack Hub.
Thu, 13 Apr 2023 21:28:09 -0000
A picture is worth a thousand words, but the value of a good hands-on lab is immeasurable!
Our newly minted interactive demo and hands-on lab are published in the Dell Technologies Demo Center:
In this blog, we'll begin with a brief introduction to these test drives. Then, we'll share our list of other virtual labs that will prove invaluable on your journey to becoming an Azure Stack HCI champion. Fasten your seatbelt and get ready to take your skills to the next level!
The interactive demo can be accessed directly by all customers and partners. When first navigating to the Demo Center site, remember to click the Sign In drop-down menu in the upper right corner of the page.
At the present time, you will not see the hands-on lab appear in the Demo Center catalog. You will need to contact your Dell Technologies account team to gain access to HOL-0313-01.
Taking this demo is like competing in a Formula 1 or NASCAR race. It is fast-paced and remains within the secure confines of the track's guardrails. Each module in the demo guides you down a well-defined path that leads to a desired business outcome. Here is a summary of the benefits our OpenManage Integration extension delivers:
Whenever new features are released for our extension, you'll be able to familiarize yourself with them here first. In the latest release (v3.0), we completely revamped the user interface for improved usability and navigation. We also added server and cluster-level checks to ensure that all prerequisites are met for seamless enablement of management and monitoring operations. The following figure illustrates the results of a prerequisite check. In the interactive demo, you learn more about these failures and how to use the OpenManage Integration extension to fix them.
When we first start driving, we need our parents and teachers to provide turn-by-turn directions. If you're exploring the extension for the first time, you'll want to keep the guides enabled to aid your understanding.
For example, consider the CPU Core Management feature. This feature allows you to right size your Azure Stack HCI cluster by enabling/disabling CPU cores to meet the requirements of your workload profile. It can also help save in subscription costs because Azure Stack HCI hosts are billed by CPU core per month. The guide in the following figure reminds you that a thorough analysis of your workload characteristics is essential prior to reducing the enabled CPU cores on this cluster.
After you've familiarized yourself with the talk track, you can leave your parents and teachers at home and drive through the demos without the detailed explanations. You can navigate using links alone by clicking the X in the upper right-hand corner of any guide. You might choose to proceed down this road to test your knowledge. As a Dell Technologies partner, you might want to create the illusion of performing a demo from a live environment to impress prospective clients.
The Microsoft Azure Stack HCI Deployment hands-on lab in the Demo Center will appeal to the more mechanically inclined. It pops open the hood so you can get your hands dirty with all the PowerShell automation in our End-to-End Deployment Guide for Day 1 deployments. It is accompanied by an in-depth student manual to point you in the right direction, but there is a bit more freedom to go off-road compared with the interactive demo. Keep in mind that this is a virtual environment, so certain tasks that require the physical hardware may be limited.
This figure illustrates how you can drag and drop the PowerShell code into the console, so you aren't wasting time typing everything yourself:
We still show the GUI some love in the later portions of the lab. Failover Cluster Manager and Windows Admin Center make an appearance after you've used PowerShell to configure the hosts, create the cluster, configure a cluster witness, and enable Storage Spaces Direct (S2D). You'll be able to use the graphical tools to confirm the expected outcomes of the work you did at the command line.
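To give a flavor of what that PowerShell automation covers, the core cmdlets follow this general pattern. This is a simplified sketch with illustrative names and addresses, not the lab's exact script:
# Create the cluster from the prepared hosts
New-Cluster -Name HCICluster01 -Node Node01, Node02 -StaticAddress 192.168.100.20 -NoStorage
# Configure a file share witness for cluster quorum
Set-ClusterQuorum -Cluster HCICluster01 -FileShareWitness \\fileserver\witness
# Enable Storage Spaces Direct, which builds the storage pool from the local drives
Enable-ClusterStorageSpacesDirect -CimSession HCICluster01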
The following figure shows the step where you use Failover Cluster Manager to inspect the newly created storage pool after it's created with PowerShell.
You'll also explore some of the management and monitoring capabilities in Windows Admin Center after adding your new cluster as a connection. This section of the HOL stops short of exploring the OpenManage Integration extension, though. We provide a link in the student manual to the interactive demo. If you’re not a fan of the layout of the lab shown in the following figure, you can rearrange the panes to fit your preferences. For example, you can open the manual in a separate window and allow your virtual desktop to consume all your screen real estate.
Maybe the interactive demo and hands-on lab don't meet your needs. Maybe you're looking to kick the tires on Azure Stack HCI without any training wheels. In that case, there are other options available to you. We have compiled a great list of resources that address a variety of use cases:
If you're looking for educational materials on Azure Stack HCI, like white papers, blogs, and videos, visit our Info Hub and main product page.
Be sure to follow me on Twitter @Evolving_Techie and LinkedIn.
Author: Mike Lamia
Tue, 01 Mar 2022 20:39:03 -0000
Who doesn’t enjoy VIP treatment? Exciting opportunities to feel like royalty include winning box seats at a sporting event or getting invited to attend opening night at a new restaurant. I received an unexpected upgrade to business class on a flight a couple years ago and remember texting every celebratory meme I could find to friends and family! These are the moments in life to really savor.
In my line of work as a technical marketing engineer, I relish any situation where VIP stands for Very Important Person rather than Virtual IP address. Private previews of the latest technology often provide both flavors of VIP.
I consider myself fortunate to be among the first to experience cutting-edge solutions with the potential to solve today’s most vexing business challenges. I also get direct access to the best minds in the software and hardware industry. They welcome my feedback, and there’s no better feeling than knowing that I’ve made a meaningful contribution to a product that will benefit the broader community! Now it’s your turn to feel the thrill of gaining early access to long-awaited new software capabilities for Azure Stack HCI.
Your official preview invitation has arrived.
You are cordially invited to participate in an exclusive VIP preview of Azure Stack HCI Configuration and Policy Compliance Visibility from Dell Technologies, integrated with Azure Arc.
The Azure Arc portfolio demonstrates the unique Microsoft approach to delivering hybrid cloud by extending Azure platform services and management capabilities to data center, edge, and multi-cloud environments. Dell Technologies uses the Azure Policy guest configuration feature and Azure Arc-enabled servers to audit software and hardware settings in Dell Integrated System for Microsoft Azure Stack HCI.
Our engineering-validated integrated system is Azure hybrid by design and delivers efficient operations using our Dell OpenManage Integration with Microsoft Windows Admin Center extension and snap-ins.
When we first developed our extension, we delivered deep hardware monitoring, inventory, and troubleshooting capabilities. Over the last few years, we have collected valuable feedback from preview programs to drive further investment and innovation into our extension. Customer experience has helped us shape new features including:
The Azure Arc integration from Dell Technologies complements Windows Admin Center and our OpenManage extension by applying robust governance services to the integrated system. Our Azure Arc integration creates software and hardware compliance policies for near real-time detection of infrastructure configuration drift at scale. It protects clusters in the data center, as well as those geographically dispersed across ROBO and edge locations, from malicious threats and inadvertent changes to operating system, BIOS, iDRAC, and network adapter settings on AX nodes from Dell Technologies. Without this visibility, you leave yourself vulnerable to security breaches, costly downtime, and degraded application performance.
All we need now is your experience and valuable feedback to help us fine-tune this critical capability!
Intentionally selected AX node attributes and values targeted by our Azure Arc integration are routinely checked for compliance against pre-defined business rules. Then, compliance results are visualized in the Policy blade of the Azure portal as shown in the following screen shots.
This guided preview is checking select OS-level, cluster-level, BIOS, iDRAC, and network adapter attributes that optimize Azure Stack HCI. If an unapproved change to these attribute values goes undetected, the integrated system may experience degradation to performance, availability, and security. The abnormal behavior of the system may not be readily traced back to the modified OS and hardware setting – delaying Mean Time to Repair (MTTR). The longer the incident takes to resolve, the greater the consequences to your business in the form of decreased productivity, lost revenue, or tarnished reputation.
Here are just some of the preview benefits in store:
Availability is limited for this guided preview. To claim your spot, please contact your account manager right away. They will coordinate with the internal teams at Dell Technologies and schedule further conversations with you. A professional services engagement is required to install the Azure Arc integration during the preview phase. We will work together to prepare the Azure artifacts and run the required scripts. Over time, Dell Technologies intends to expand this compliance visibility to a much larger set of attributes in an extensible, user-friendly framework.
I hope you’re as excited as I am to deliver this configuration and policy compliance visibility using Azure Arc to Dell Integrated System for Microsoft Azure Stack HCI. The technical previews that I’ve been a part of have been some of the most memorable and rewarding experiences of my career. An unexpected upgrade to business class is nice but contributing to the success of a technology that will help my industry peers for years to come? Priceless.
Author: Michael Lamia
Twitter: @Evolving_Techie
LinkedIn: https://www.linkedin.com/in/michaellamia/
Wed, 20 Oct 2021 19:59:25 -0000
Like any good techie, I can get a little obsessed with gadgets that improve my quality of life. Take, for example, my recent discovery of wearable technology that eases the symptoms of motion sickness. For most of my life, I’ve had to take over-the-counter or prescription medicine when boating, flying, and going on road trips. Then, I stumbled across a device that I could wear around my wrist that promised to solve the problem without the side effects. Hesitantly, I bought the device and asked a friend to drive like a maniac around town while I sat in the back seat. It actually worked – no headache, no nausea, and no grogginess from meds! Needless to say, I never leave home without my trusty gizmo to keep motion sickness at bay.
Throughout my career in managing IT infrastructure, stress has affected my quality of life almost as much as motion sickness. There is one responsibility that has always caused more angst than anything else: lifecycle management (LCM). To narrow that down a bit, I’m specifically talking about patching and updating IT systems under my control. I have sometimes been derelict in my duties because of annoying manual steps that distract me from working on the fun, highly visible projects. It’s these manual steps that can cause the dreaded DU/DL (data unavailable or data loss) to rear its ugly head. Can you say insomnia?
Innovative technology to the rescue once again! While creating a demo video last year for our Dell EMC OpenManage Integration with Microsoft Windows Admin Center (OMIMSWAC), I was blown away by how easy we made the BIOS, firmware, and driver updates on clusters. The video did a pretty good job of showing the power of the Cluster-Aware Updating (CAU) feature, but it didn’t go far enough. I needed to quantify its full potential to change an IT professional’s life by pitting OMIMSWAC’s automated CAU approach against a manual, node-based approach. I captured the results of the bake-off in Dell EMC HCI Solutions for Microsoft Windows Server: Lifecycle Management Approach Comparison.
For this white paper to really stand the test of time, I knew I needed to be very clever to compare apples-to-apples. First, I referred to HCI Operations Guide—Managing and Monitoring the Solution Infrastructure Life Cycle, which detailed the hardware updating procedures for both the CAU and node-based approaches. Then, I built a 4-node Dell EMC HCI Solutions for Windows Server 2019 cluster, performed both update scenarios, and recorded the task durations. We all know that automation is king, but I didn’t expect the final tally to be quite this good:
As you can see from the following charts taken from the paper, these numbers only improved as I extrapolated them out to the maximum Windows Server HCI cluster size of 16 nodes.
I thought these results were too good to be true, so I checked my steps about 10 times. In fact, I even debated with my Marketing and Product Management counterparts about sharing these claims with the public! I could hear our customers saying, “Oh, yeah, right! These are just marketecture hero numbers.” But in this case, I collected the hard data myself. I am still confident that these results will stand up to any scrutiny. This is reality – not dreamland!
So why am I blogging about a project I did last year? Just when I thought the testing results in the white paper couldn’t possibly get any better, Dell EMC Integrated System for Microsoft Azure Stack HCI came along. Azure Stack HCI is Microsoft’s purpose-built operating system delivered as an Azure service. The current release at the time of writing this blog was Azure Stack HCI, version 20H2. Our Solution Brief provides a great overview of our all-in-one validated HCI system, which delivers efficient operations, flexible consumption models, and end-to-end enterprise support and services. But what I’m most excited about are two lifecycle management enhancements – 1-click full stack LCM and Kernel Soft Reboot – that will put an end to the old adage, “If it looks too good to be true, it probably is.”
OMIMSWAC was at version 1.1 when I did my testing last year. In that version, the CAU feature focused on the hardware – BIOS, firmware, and drivers. In OMIMSWAC v2.0, we developed an exclusive snap-in to Microsoft’s Failover Cluster Tool Extension to create 1-click full stack LCM. Only available for clusters running Azure Stack HCI, a simple workflow in Windows Admin Center automates not only the hardware updates – but also the operating system updates. How do I see this feature lowering my blood pressure?
The following screen shots were taken from the full stack CAU workflow. The first step indicates which OS updates are available for the cluster nodes.
Node validation is performed first before moving forward with hardware updates.
If the Windows Admin Center host is connected to the Internet, the online update source approach obtains all the systems management utilities and the engineering validated solution catalog automatically. If operating in an edge or disconnected environment, the solution catalog can be created with Dell EMC Repository Manager and placed on a file server share accessible from the cluster nodes.
The following image shows a generated compliance report. All non-compliant components are selected by default for updating. After this point, all the OS and non-compliant hardware components will be updated together with only a single reboot per node in the cluster and with no impact to running workloads.
Speaking of reboots, Kernel Soft Reboot (KSR) is a new feature coming in Azure Stack HCI, version 21H2 that also has the potential to make my white paper claims even more jaw dropping. KSR will give me the ability to perform a “software-only restart” on my servers – sparing me from watching the paint dry during those long physical server reboots. Initially, the types of updates in scope will be OS quality and security hotfixes since these don’t require BIOS/firmware initialization. Dell Technologies is also working on leveraging KSR for the infrastructure updates in a future release of OMIMSWAC.
KSR will be especially beneficial when using Microsoft’s CAU extension in Windows Admin Center. The overall time savings from KSR multiply for clusters because faster restarts mean less resyncing of data after CAU resumes each cluster node. Each node should reboot at Mach speed if there are only Azure Stack HCI OS hotfixes and Dell EMC Integrated System infrastructure updates that do not require the full reboot. I will definitely be hounding my Product Managers and Engineering team to deliver KSR for infrastructure updates in our OMIMSWAC extension ASAP.
I decided to hold off on doing a new bakeoff until Azure Stack HCI, version 21H2 is released with KSR. I also want to wait until we bring the benefits of KSR to OMIMSWAC for infrastructure updates. The combination of OMIMSWAC 1-click full stack CAU and KSR will continue to make OMIMSWAC unbeatable for seamless lifecycle management. This means better outcomes for our organizations, improved blood pressure and quality of life for IT pros, and more motion-sickness-free adventure vacations. I’m also looking forward to spending more time learning exciting new technologies and less time with routine administrative tasks.
If you’d like to get hands-on with all the different features in OMIMSWAC, check out the Interactive Demo in Dell Technologies Demo Center. Also, check out my other white papers, blogs, and videos in the Dell Technologies Info Hub.
Thu, 22 Jul 2021 10:35:43 -0000
Introduction
Welcome to Part 3 of a three-part blog series on running containerized applications on Microsoft Azure’s hybrid ecosystem. In this part, we step through how we deployed a self-hosted, open-source Docker Registry v2 to an Azure Stack Hub user subscription. We also discuss how we pushed a microservices-based application to the self-hosted registry and then pulled those images from that registry onto the K8s cluster we deployed using AKS engine.
Here are the links to all the series articles:
Part 1: Running containerized applications on Microsoft Azure’s hybrid ecosystem – Provides an overview of the concepts covered in the blog series.
Part 2: Deploy K8s Cluster into an Azure Stack Hub user subscription – Set up an AKS engine client VM, deploy a cluster using AKS engine, and onboard the cluster to Azure Monitor for containers.
Part 3: Deploy a self-hosted Docker Container Registry – Use one of the Azure Stack Hub QuickStart templates to set up a container registry and push images to this registry. Then, pull these images from the registry into the K8s cluster deployed with AKS engine in Part 2.
Here is a diagram depicting the high-level architecture in the lab for review.
There are a few reasons why an organization may want to deploy and maintain a container registry on-premises rather than leveraging a publicly accessible registry like Docker Hub. This approach may be particularly appealing to customers in air-gapped deployments where there is unreliable connectivity to Public Azure or no connectivity at all. This can work well with K8s clusters that have been deployed using AKS engine at military installations or other such locales. There may also be regulatory compliance, data sovereignty, or data gravity issues that can be addressed with a local container registry. Some customers may simply require tighter control over where their images are being stored or need to integrate image storage and distribution tightly into their in-house development workflows.
Prerequisites
The deployment of Docker Registry v2 into an Azure Stack Hub user subscription was performed using the 101-vm-linux-docker-registry Azure Stack Hub QuickStart template. There are two phases to this deployment:
1. Creation of a storage account and key vault in a discrete resource group by running the setup.ps1 PowerShell script provided in the template repo.
2. Deployment of the ARM template that describes the container registry VM and its associated resources, using the azuredeploy.parameters.json file that the setup.ps1 script also created. These resources get provisioned into a separate resource group.
Be aware that there is another experimental repository in GitHub called msazurestackworkloads for deploying Docker Registry. The main difference is that the code in the msazurestackworkloads repo includes artifacts for creating a marketplace gallery item that an Azure Stack Hub user could deploy into their subscription. Again, this is currently experimental and is not supported for deploying the Docker Registry.
One of the prerequisite steps was to obtain an X.509 SSL certificate in PFX format for loading into key vault. We pointed to its location during the running of the setup.ps1 script. We used our lab’s internal standalone CA to create the certificate, which is the same CA used for deploying the K8s cluster with AKS engine. We thought we’d share the steps we took to obtain this certificate in case any readers aren’t familiar with the process. All these steps must be completed from the same workstation to ensure access to the private key.
We performed these steps on the management workstation:
1. Created an INF file that looked like the following. The subject’s CN is the DNS name we provided for the registry VM’s public IP address.
[Version]
Signature="$Windows NT$"
[NewRequest]
Subject = "CN=cseregistry.rack04.cloudapp.azs.mhclabs.com,O=CSE Lab,L=Round Rock,S=Texas,C=US"
Exportable = TRUE
KeyLength = 2048
KeySpec = 1
KeyUsage = 0xA0
MachineKeySet = True
ProviderName = "Microsoft RSA SChannel Cryptographic Provider"
HashAlgorithm = SHA256
RequestType = PKCS10
[Strings]
szOID_SUBJECT_ALT_NAME2 = "2.5.29.17"
szOID_ENHANCED_KEY_USAGE = "2.5.29.37"
szOID_PKIX_KP_SERVER_AUTH = "1.3.6.1.5.5.7.3.1"
szOID_PKIX_KP_CLIENT_AUTH = "1.3.6.1.5.5.7.3.2"
[Extensions]
%szOID_SUBJECT_ALT_NAME2% = "{text}dns=cseregistry.rack04.cloudapp.azs.mhclabs.com"
%szOID_ENHANCED_KEY_USAGE% = "{text}%szOID_PKIX_KP_SERVER_AUTH%,%szOID_PKIX_KP_CLIENT_AUTH%"
[RequestAttributes]
2. We used the certreq.exe command to generate a CSR that we then submitted to the CA.
certreq.exe -new cseregistry_req.inf cseregistry_csr.req
3. We received a .cer file back from the standalone CA and followed the instructions here to convert this .cer file to a .pfx file for use with the container registry.
Prepare Azure Stack PKI certificates for deployment or rotation
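As a rough sketch of that conversion (the linked Microsoft article is the authoritative procedure), we accepted the issued certificate into the local machine store on the same workstation that generated the CSR and then exported it, private key included, to a PFX file. The file names and password below are illustrative:
certreq.exe -accept cseregistry.cer
# Locate the issued certificate and export it with its private key
$cert = Get-ChildItem Cert:\LocalMachine\My | Where-Object { $_.Subject -like "*cseregistry*" } | Select-Object -First 1
$pfxPassword = ConvertTo-SecureString "mypassword" -AsPlainText -Force
Export-PfxCertificate -Cert $cert -FilePath .\cseregistry.pfx -Password $pfxPassword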
We also needed the CA’s root certificate in .crt file format. We originally obtained this during the K8s cluster deployment using AKS engine. This needs to be imported into the certificate store of any device that intends to interact with the container registry.
Deploy the container registry supporting infrastructure
We used the setup.ps1 PowerShell script included in the QuickStart template’s GitHub repo for creating the supporting infrastructure for the container registry. We named the new resource group created by this script demoregistryinfra-rg. This resource group contains a storage account and key vault. The registry is configured to use the Azure storage driver to persist the container images in the storage account blob container. The key vault stores the credentials required to authenticate to the registry as a secret and secures the certificate. A service principal (SPN) created prior to executing the setup.ps1 script is leveraged to access the storage account and key vault.
Here are the values we used for the variables in our setup.ps1 script for reference (sensitive information removed, of course). Notice that the $dnsLabelName value is only the hostname of the container registry server and not the fully qualified domain name.
$location = "rack04"
$resourceGroup = "registryinfra-rg"
$saName = "registrystorage"
$saContainer = "images"
$kvName = "registrykv"
$pfxSecret = "registrypfxsecret"
$pfxPath = "D:\file_system_path\cseregistry.pfx"
$pfxPass = "mypassword"
$spnName = "Application (client) ID"
$spnSecret = "Secret"
$userName = "cseadmin"
$userPass = "!!123abc"
$dnsLabelName = "cseregistry"
$sshKey = "SSH key generated by PuttyGen"
$vmSize = "Standard_F8s_v2"
$registryTag = "2.7.1"
$registryReplicas = "5"
Deploy the container registry using the QuickStart template
We deployed the QuickStart template using a similar PowerShell command as the one indicated in the README.md of the GitHub repo. Again, the azuredeploy.parameters.json file was created automatically by the setup.ps1 script. This was very straightforward. The only thing to mention is that we created a new resource group when deploying the QuickStart template. We could have also selected an existing resource group that did not contain any resources.
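For reference, the deployment step boils down to a standard ARM template deployment against the user subscription. This is a hedged sketch using the AzureRM cmdlets current for Azure Stack Hub at the time; the deployment name is illustrative:
# Create the resource group for the registry VM, then deploy the template with the generated parameters file
New-AzureRmResourceGroup -Name demoregistrycompute-rg -Location rack04
New-AzureRmResourceGroupDeployment -Name dockerregistrydeploy `
    -ResourceGroupName demoregistrycompute-rg `
    -TemplateFile .\azuredeploy.json `
    -TemplateParameterFile .\azuredeploy.parameters.json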
Testing the Docker Registry with the Sentiment Analyzer application
At this point, it was time to test the K8s cluster and self-hosted container registry running on Azure Stack Hub from end-to-end. For this, we followed a brilliant blog article entitled Learn Kubernetes in Under 3 Hours: A Detailed Guide to Orchestrating Containers written by Rinor Maloku. This was a perfect introduction to the world of creating a microservices-based application running in multiple containers. It covers Docker and Kubernetes fundamentals and is an excellent primer for anyone just getting started in the world of containers and container orchestration. The name of the application is Sentiment Analyser, and it uses text analysis to ascertain the emotion of a sentence.
Learn Kubernetes in Under 3 Hours: A Detailed Guide to Orchestrating Containers
We won’t share all the notes we took while walking through the article. However, there are a couple tips we wanted to highlight as they pertain to testing the K8s cluster and new self-hosted Docker Registry in the lab:
sudo docker tag <image ID> cseregistry.rack04.cloudapp.azs.mhclabs.com/sentiment-analysis-frontend
sudo docker push cseregistry.rack04.cloudapp.azs.mhclabs.com/sentiment-analysis-frontend
https://cseregistry.rack04.cloudapp.azs.mhclabs.com/v2/_catalog
The following output was given after we pushed all the application images:
{"repositories":["sentiment-analysis-frontend","sentiment-analysis-logic","sentiment-analysis-web-app"]}
Pull an Image from a Private Registry
In the section of this article entitled Create a Secret by providing credentials on the command line, we discovered a couple of items to note:
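For context, the secret-creation command from that documentation section takes this general form. It is shown here populated with our registry’s FQDN and the registry user defined in setup.ps1; the password placeholder is whatever was supplied for $userPass:
kubectl create secret docker-registry regcred \
  --docker-server=cseregistry.rack04.cloudapp.azs.mhclabs.com \
  --docker-username=cseadmin \
  --docker-password=<registry password> \
  --docker-email=<any email address>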
We’ve now gotten into the habit of verifying the secret after we create it by using the following command:
kubectl get secret regcred --output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode
Here is the content of the pod manifest file that worked:
apiVersion: v1
kind: Pod                          # 1
metadata:
  name: sa-frontend
  labels:
    app: sa-frontend               # 2
spec:                              # 3
  containers:
  - image: cseregistry.rack04.cloudapp.azs.mhclabs.com/sentiment-analysis-frontend   # 4
    name: sa-frontend              # 5
    ports:
    - containerPort: 80            # 6
  imagePullSecrets:
  - name: regcred
At this point, we were able to observe the fully functional Sentiment Analyser application running on our K8s cluster on Azure Stack Hub. We were not only running this application on-premises in a highly prescriptive, self-managed Kubernetes cluster, but we were also able to do so while leveraging a self-hosted Docker Registry for transferring the images. We could also proceed to Azure Monitor for containers using the Public Azure portal to monitor our running containerized application and create thresholds for timely alerting on any potential issues.
Articles in this blog series
We hope this blog post proves to be insightful to anyone deploying a self-hosted container registry on Azure Stack Hub. It has been a great learning experience stepping through the deployment of a containerized application using the Microsoft Azure toolset. There are so many other things we want to try like Deploying Azure Cognitive Services to Azure Stack Hub and using Azure Arc to run Azure data services on our K8s cluster on Azure Stack Hub. We look forward to sharing more of our testing results on these exciting capabilities in the near future.
Wed, 16 Jun 2021 13:35:49 -0000
Small offices and remote branch office (ROBO) use cases present special challenges for IT organizations. The issues tend to revolve around how to implement a scalable, resilient, secure, and highly performant platform at an affordable TCO. The infrastructure must be capable enough to efficiently run a highly diversified portfolio of applications and services and yet be simple to deploy, update, and support by a local IT generalist. Dell Technologies and Microsoft help you accelerate business outcomes in these unique ROBO environments with our Dell EMC Solutions for Microsoft Azure Stack HCI.
In this blog post, we share VMFleet results observed in the Dell Technologies labs for our newest AX-6515 two-node configuration – ideal for ROBO environments. Optimized for value, the small but powerful AX-6515 node packs a dense, single-socket 2nd Gen AMD EPYC processor in a 1RU chassis delivering peak performance and excellent TCO. We also included the Dell EMC PowerSwitch S5212F-ON in our testing to provide 25GbE network connectivity for the storage, management, and VM traffic in a small form factor. The Dell EMC Solutions for Azure Stack HCI Deployment Guide was followed to construct the test lab and applies only to infrastructure that is built with validated and certified AX nodes running Microsoft Windows Server 2019 Datacenter from Dell Technologies.
We were quite impressed with the VMFleet results. First, we stressed the cluster’s storage subsystem to its limits using scenarios aimed at identifying maximum IOPS, latency, and throughput. Then, we adjusted the test parameters to be more representative of real-world workloads. The following summary of findings indicated to us that this two-node, AMD-based, all-flash cluster could meet or exceed the performance requirements of workload profiles often found in ROBO environments:
The following diagram illustrates the environment created in the Dell Technologies labs for the VMFleet testing. Ancillary services required for cluster operations such as DNS, Active Directory, and a file server for cluster quorum are not depicted.
Figure 1 Network topology
Table 1 Cluster configuration
| Cluster Design Elements | Description |
|---|---|
| Number of cluster nodes | 2 |
| Cluster node model | AX-6515 |
| Number of network switches for RDMA and TCP/IP traffic | 2 |
| Network switch model | Dell EMC PowerSwitch S5212F-ON |
| Network topology | Fully-converged network configuration. RDMA and TCP/IP traffic traversing 2 x 25GbE network connections from each host. |
| Network switch for OOB management | Dell EMC PowerSwitch S3048-ON |
| Resiliency option | Two-way mirror |
| Usable storage capacity | Approximately 12 TB |
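After deployment, the resiliency option and capacity figures summarized in Table 1 can be spot-checked from any cluster node with standard Storage Spaces Direct cmdlets. This is only an illustrative sanity check, not part of the validated deployment procedure:

# Confirm Storage Spaces Direct is enabled and inspect the pool
Get-ClusterStorageSpacesDirect
Get-StoragePool -FriendlyName S2D* | Select-Object FriendlyName, Size, AllocatedSize, HealthStatus

# Confirm each volume is protected with a two-way mirror
Get-VirtualDisk | Select-Object FriendlyName, ResiliencySettingName, NumberOfDataCopies, HealthStatus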
Table 2 Cluster node resources
| Resources per Cluster Node | Description |
|---|---|
| CPU | Single-socket AMD EPYC 7702P 64-Core Processor |
| Memory | 256 GB DDR4 RAM |
| Storage controller for OS | BOSS-S1 adapter card |
| Physical drives for OS | 2 x Intel 240 GB M.2 SATA drives configured as RAID 1 |
| Storage controller for Storage Spaces Direct (S2D) | HBA330 Mini |
| Physical drives | 8 x 1.92 TB Mixed Use KIOXIA SAS SSDs |
| Network adapter | Mellanox ConnectX-5 Dual Port 10/25GbE SFP28 Adapter |
| Operating System | Windows Server 2019 Datacenter |
The architectures of Azure Stack HCI solutions are highly opinionated and prescriptive. Each design is extensively tested and validated by Dell Technologies Engineering. Here is a summary of the key quality attributes that define these architectures, followed by a section devoted to our performance findings.
We leveraged VMFleet to benchmark the storage subsystem of our 2-node cluster. Many Microsoft customers and partners rely on this tool to help them stress test their Azure Stack HCI clusters. VMFleet consists of a set of PowerShell scripts that deploy virtual machines to a Hyper-V cluster and execute Microsoft’s DiskSpd within those VMs to generate IO. The following table presents the range of VMFleet and DiskSpd parameters used during our testing in the Dell Technologies labs.
Table 3 Test parameters
| VMFleet and DiskSpd Parameters | Values |
|---|---|
| Number of VMs running per node | 20 |
| vCPUs per VM | 2 |
| Memory per VM | 8 GB |
| VHDX size per VM | 40 GB |
| VM Operating System | Windows Server 2019 |
| Block sizes (B) | 4k – 512k |
| Thread count (T) | 2 |
| Outstanding IOs (O) | 32 |
| Write percentages (W) | 0, 20, 50, 100 |
| IO patterns (P) | Random, Sequential |
We first selected DiskSpd scenarios aimed at identifying the maximum IOPS, latency, and throughput thresholds of the cluster. By pushing the limits of the storage subsystem, we confirmed that the networking, compute, operating systems, and virtualization layer were configured correctly according to our Deployment Guide and Network Integration and Host Network Configuration Options guide. This also ensured that no misconfiguration during initial deployment could skew the real-world storage performance results. Our results are depicted in Table 4.
Table 4 Maximums test results
| Scenario | Parameter Values Explained | Performance Metric |
|---|---|---|
| B4-T2-O32-W0-PR | Block size: 4k; thread count: 2; outstanding IO: 32; IO pattern: 100% random read | IOPS: 1,146,948; read latency: 245 microseconds; CPU utilization: 48% |
| B4-T2-O32-W100-PR | Block size: 4k; thread count: 2; outstanding IO: 32; IO pattern: 100% random write | IOPS: 417,591; write latency: 4 milliseconds; CPU utilization: 25% |
| B512-T2-O2-W0-PSI | Block size: 512k; thread count: 2; outstanding IO: 8; IO pattern: 100% sequential read | Throughput: 12 GB/s |
| B512-T2-O2-W100-PSI | Block size: 512k; thread count: 2; outstanding IO: 8; IO pattern: 100% sequential write | Throughput: 6 GB/s |
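For reference, the first scenario in Table 4 corresponds roughly to the following raw DiskSpd invocation. This is an illustrative sketch only – the test file path, file size, and duration are placeholders, and VMFleet generates the equivalent command inside each fleet VM:

# B4-T2-O32-W0-PR: 4k blocks, 2 threads, 32 outstanding IOs, 100% random read
DiskSpd.exe -b4K -t2 -o32 -w0 -r -d60 -Sh -L -c10G C:\ClusterStorage\Collect\testfile.dat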
We then stressed the storage subsystem using IO patterns more reflective of the types of workloads found in a ROBO use case. These applications are typically characterized by smaller block sizes, random I/O patterns, and a variety of read/write ratios. Examples include general enterprise and small office LOB applications and OLTP workloads. The testing results in Figure 2 below indicate that the cluster has the potential to accelerate OLTP workloads and make enterprise applications highly responsive to end users.
Figure 2 Performance results with smaller block sizes
Other services like backups, streaming video, and large dataset scans have larger block sizes and sequential IO patterns. With these workloads, throughput becomes the key performance indicator to analyze. The results shown in the following graph indicate an impressive sustained throughput that can greatly benefit this category of IT services and applications.
Figure 3 Performance results with larger block sizes
Customers can modify the lab configuration to accommodate different requirements in the ROBO use case. For example, Dell Technologies fully supports a dual-link full mesh topology for Azure Stack HCI. This non-converged, storage-switchless topology eliminates the need for network switches for storage communications and enables you to use existing infrastructure for management and VM traffic. This approach should result in similar or better performance than the metrics in this blog because the 2 x 25GbE direct connections between the nodes isolate the storage traffic on dedicated links.
Figure 4 Two-node back-to-back architecture option
There may be ROBO scenarios where no IT generalist is near the site to address hardware failures. When a drive or an entire node fails, it may take days or weeks before someone can service the nodes and return the cluster to full functionality. Consider nested resiliency instead of two-way mirroring to handle multiple failures on a two-node cluster. Inspired by RAID 5+1 technology, nested resiliency keeps workloads online and accessible even in the following circumstances:
Figure 5 Nested resiliency option
Be aware that nested resiliency carries a capacity efficiency cost. Two-way mirroring is 50% efficient, meaning 1 TB of data consumes 2 TB of physical storage capacity. Depending on the type of nested resiliency you configure, capacity efficiency ranges between 25% and 40%, so ensure you have adequate raw storage capacity if you intend to use this technology. Performance is also affected with nested resiliency – especially for workloads with a higher percentage of write IO, since more copies of the data must be maintained on the cluster.
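For context, nested two-way mirror volumes are typically created by first defining a storage tier that keeps four copies of the data and then building volumes from that tier. The tier name, volume name, and size below are placeholders; validate the exact cmdlet parameters against Microsoft's current nested resiliency guidance before using them:

# Define a nested two-way mirror tier (four data copies) on the S2D pool
New-StorageTier -StoragePoolFriendlyName S2D* -FriendlyName NestedMirror `
    -ResiliencySettingName Mirror -MediaType SSD -NumberOfDataCopies 4

# Create a volume from the nested mirror tier
New-Volume -StoragePoolFriendlyName S2D* -FriendlyName Volume01 `
    -StorageTierFriendlyNames NestedMirror -StorageTierSizes 500GB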
If you need greater flexibility in cluster resources, Dell Technologies offers Azure Stack HCI configurations to meet any workload profile and business requirement. The table below shows the different resource options available for each AX node. For more detailed specifications of these configurations, please review our product page.
Table 5 Azure Stack HCI configuration options
Visit our website for more details on Dell EMC Solutions for Azure Stack HCI.
Wed, 16 Jun 2021 13:35:49 -0000
|Read Time: 0 minutes
Modern IT applications have a broad range of performance requirements. Some of the most demanding applications use Online Transactional Processing (OLTP) database technology. Typical organizations have many mission critical business services reliant on workloads powered by these databases. Examples of such services include online banking in the financial sector and online shopping in the retail sector. If the response time of these systems is slow, customers will likely suffer a poor user experience and may take their business to competitors. Dissatisfied customers may also express their frustration through social media outlets resulting in incalculable damage to a company’s reputation.
The challenge in maintaining an exceptional consumer experience is providing databases with performant infrastructure while also balancing capacity and cost. Traditionally, there have been few cost-effective options for caching database workloads, even though such caching would greatly improve end-user response times. Intel Optane persistent memory (Intel Optane PM) offers an innovative path to accelerating database workloads. Intel Optane PM performs almost as well as DRAM, and the data is preserved after a power cycle. We were interested in quantifying these claims in our labs with Dell EMC HCI Solutions for Microsoft Windows Server.
Windows Server HCI running Microsoft Windows Server 2019 provides industry-leading virtual machine performance with Microsoft Hyper-V and Microsoft Storage Spaces Direct technology. The platform supports Non-Volatile Memory Express (NVMe), Intel Optane PM, and Remote Direct Memory Access (RDMA) networking. Windows Server HCI is a fully productized, validated, and supported HCI solution that enables enterprises to modernize their infrastructure for improved application uptime and performance, simplified management and operations, and lower total cost of ownership. AX nodes from Dell EMC, powered by industry-leading PowerEdge server platforms, offer a high-performance, scalable, and secure foundation on which to build a software-defined infrastructure.
In our lab testing, we wanted to observe the impact on performance when Intel Optane PM was added as a caching tier to a Windows Server HCI cluster. We set up two clusters to compare. One cluster was configured as a two-tier storage subsystem with Intel Optane PM in the caching tier and SATA Read-Intensive SSDs in the capacity tier. We inserted 12 x 128 GB Intel Optane PM modules into this cluster for a total of 1.5 TB per node. The other cluster’s storage subsystem was configured as a single-tier of SATA Read-Intensive SSDs. With respect to CPU selection, memory, and Ethernet adapters, the two clusters were configured identically.
Only the Dell EMC AX-640 nodes currently accommodate Intel Optane PM. The clusters were configured as follows:
| Cluster Resources | Without Intel Optane PM | With Intel Optane PM |
|---|---|---|
| Number of nodes | 4 | 4 |
| CPU | 2 x Intel 6248 CPU @ 2.50 GHz (3.90 GHz with TurboBoost) | 2 x Intel 6248 CPU @ 2.50 GHz (3.90 GHz with TurboBoost) |
| Memory | 384 GB RAM | 384 GB RAM |
| Disks | 10 x 2.5 in. 1.92 TB Intel S4510 RI SATA SSD | 10 x 2.5 in. 1.92 TB Intel S4510 RI SATA SSD |
| NICs | Mellanox ConnectX-5 EX Dual Port 100 GbE | Mellanox ConnectX-5 EX Dual Port 100 GbE |
| Persistent memory | None | 12 x 128 GB Intel Optane PM per node |
Volumes were created using three-way mirroring for the best balance between performance and resiliency. Three-way mirroring protects data by enabling the cluster to safely tolerate two hardware failures. For example, data on a volume would be successfully preserved even after the simultaneous loss of an entire node and a drive in another node.
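For illustration, a three-way mirrored volume like those used in this testing can be created with a single New-Volume call. On a cluster with three or more nodes, the mirror resiliency setting defaults to a three-way mirror; the volume name and size below are placeholders rather than the exact values from our lab:

# Create a three-way mirrored, ReFS-formatted cluster shared volume
New-Volume -StoragePoolFriendlyName S2D* -FriendlyName Volume01 `
    -FileSystem CSVFS_ReFS -ResiliencySettingName Mirror -Size 2TB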
Intel Optane PM has two operating modes – Memory Mode and App Direct Mode. Our tests used App Direct Mode, in which the operating system uses Intel Optane PM as persistent memory distinct from DRAM. This mode enables extremely high-performing storage that is byte-addressable, memory coherent, and cache coherent. Cache coherence is important because it ensures that data is a uniformly shared resource across all nodes. In the four-node Windows Server HCI cluster, cache coherence ensured that data read or written on one node was consistently available to all nodes.
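On Windows Server 2019, the persistent memory modules and the pmem disks presented to the OS in App Direct Mode can be inspected with the built-in PersistentMemory cmdlets. The following read-only checks are illustrative and were not part of the benchmark procedure itself:

# List the physical Intel Optane PM modules installed in a node
Get-PmemPhysicalDevice

# List the persistent memory disks exposed to the operating system
Get-PmemDisk

# Check how Storage Spaces Direct sees the persistent memory devices (SCM media type)
Get-PhysicalDisk | Where-Object MediaType -eq "SCM" |
    Select-Object FriendlyName, MediaType, Usage, Size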
VMFleet is a storage load generation tool designed to perform I/O and capture performance metrics for Microsoft failover clusters. For the small block test, we used VMFleet to generate 100 percent reads at a 4K block size. The baseline configuration without Intel Optane PM sustained 2,103,412 IOPS at 1.5 milliseconds (ms) average read latency. These baseline metrics are already outstanding; however, OLTP databases typically target read latencies of 1 ms or less.
Comparatively, the Intel Optane PM cluster delivered 43 percent more IOPS and 53 percent lower latency. Overall, this cluster sustained slightly over 3 million IOPS at 0.7 ms average latency. Benefits include:
When exploring storage responsiveness, testing large block read and write requests is also important. Data warehouses and decision-support systems are examples of workloads that read larger blocks of data. For this testing, we used 512 KB block sizes and sequential reads as part of the VMFleet testing. This test provided insight into the ability of Intel Optane PM cache to improve storage system throughput.
The cluster populated with Intel Optane PM was 109% faster than the baseline system: our comparisons of 512 KB sequential reads found total throughput of 11 GB/s for the system without Intel Optane PM and 23 GB/s for the system with Intel Optane PM caching. Benefits include:
Overall, the VMFleet tests were impressive. Both Windows Server HCI configurations had 40 SSDs across the four nodes for approximately 76 TB of performant storage. To accelerate the entire cluster required 12 Intel Optane PM 128 GB modules per server for a total of 48 modules across the four nodes. Test results show that both OLTP and data-warehouse type workloads would exhibit significant performance improvements.
Testing 100 percent reads of 4K blocks showed:
Testing 512 KB sequential reads showed:
The configuration presented in this lab testing scenario will not be appropriate for every application. Any Windows Server HCI solution must be properly scoped and sized to meet or exceed the performance and capacity requirements of its intended workloads. Work with your Dell Technologies account team to ensure that your system is correctly configured for today’s business challenges and ready for expansion in the future. To learn more about Microsoft HCI Solutions from Dell Technologies, visit our Info Hub page.
Mon, 14 Dec 2020 15:37:06 -0000
|Read Time: 0 minutes
Dell EMC Integrated System for Microsoft Azure Stack Hub has been extending Microsoft Azure services to customer-owned data centers for over three years. Our platform has enabled organizations to create a hybrid cloud ecosystem that drives application modernization and to address business concerns around data sovereignty and regulatory compliance.
Dell Technologies, in collaboration with Microsoft, is excited to announce upcoming enhancements that will unlock valuable, real-time insights from local data using GPU-accelerated AI and ML capabilities. Actionable information can be derived from large on-premises data sets at the intelligent edge without sacrificing security.
Today, customers can order our Azure Stack Hub dense scale unit configuration with NVIDIA Tesla V100S GPUs for running compute-intensive AI processes like inferencing, training, and visualization from virtual machine or container-based applications. Some customers choose to run Kubernetes clusters on their hardware-accelerated Azure Stack Hub scale units to process and analyze data sent from IoT devices or Azure Stack Edge appliances. Powered by the Dell EMC PowerEdge R840 rack server, these NVIDIA Tesla V100S GPUs use Discrete Device Assignment (DDA), also known as GPU pass-through, to dedicate one or more GPUs to an Azure Stack Hub NCv3 VM.
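From a tenant's perspective, consuming one of these GPUs is simply a matter of requesting an NCv3-series VM size through the Azure Stack Hub user ARM endpoint. The environment name, endpoint, image, resource group, and size below are placeholders and will depend on the offers published in your subscription:

# Register the Azure Stack Hub user environment and sign in (placeholder values)
Add-AzEnvironment -Name "AzureStackHubUser" -ArmEndpoint "https://management.region.example.com"
Connect-AzAccount -Environment "AzureStackHubUser"

# Request a GPU-enabled VM size; DDA passes the full GPU through to the guest
New-AzVM -ResourceGroupName "rg-gpu" -Name "gpu-vm01" -Location "region" `
    -Image "UbuntuLTS" -Size "Standard_NC6s_v3"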
The following figure illustrates the resources installed in each GPU-equipped Azure Stack Hub dense configuration scale unit node.
This month, our Dell EMC Azure Stack Hub release 2011 will also support the NVIDIA T4 GPU – a single-slot, low-profile adapter powered by NVIDIA Turing Tensor Cores. These GPUs are perfect for accelerating diverse cloud-based workloads, including light machine learning, inference, and visualization. These adapters can be ordered with Dell EMC Azure Stack Hub all-flash scale units powered by Dell EMC PowerEdge R640 rack servers. Like the NVIDIA Tesla V100S, these GPUs use DDA to dedicate one adapter’s powerful capabilities to a single Azure Stack Hub NCas_v4 VM. A future Azure Stack Hub release will also enable GPU partitioning on the NVIDIA T4.
The following figure illustrates the resources installed in each GPU-equipped Azure Stack Hub all-flash configuration scale unit node.
We are also pleased to announce a partnership with AMD to deliver GPU capabilities in our Dell EMC Integrated System for Microsoft Azure Stack Hub. Available today, customers can order our dense scale unit configuration with AMD Radeon Instinct MI25 GPUs aimed at graphics intensive visualization workloads like simulation, CAD applications, and gaming. The MI25 uses GPU partitioning (GPU-P) technology to allow users of an Azure Stack Hub NVv4 VM to consume only a portion of the GPU’s resources based on their workload requirements.
The following table is a summary of our hardware acceleration capabilities.
Following our rigorous engineering approach, Dell Technologies treats GPUs as far more than just additional hardware components in the Dell EMC Integrated System for Microsoft Azure Stack Hub portfolio. We apply our experience as leaders in appliance-based solutions to the entire lifecycle of all our scale unit configurations. The dense and all-flash scale unit configurations with integrated GPUs are designed around best practices and use cases for Azure-based workloads, rather than workloads running on traditional virtualization platforms. Dell Technologies is also committed to a simplified experience for initial deployment, patching and updates, support, and streamlined operations and monitoring for these new configurations.
There are a couple of additional details worth mentioning about our new Azure Stack Hub dense and all-flash scale unit configurations with hardware acceleration:
Mon, 17 Aug 2020 18:48:17 -0000
|Read Time: 0 minutes
Introduction
A vast array of services and tooling has evolved in support of microservices and container-based application development patterns. One indispensable asset in the technology value stream found in most of these patterns is Kubernetes (K8s). Technology professionals like K8s because it has become the de facto standard for container orchestration. Business leaders like it for its potential to help disrupt their chosen marketplace. However, deploying and maintaining a Kubernetes cluster and its complementary technologies can be a daunting task for the uninitiated.
Enter Microsoft Azure’s portfolio of services, tools, and documented guidance for developing and maintaining containerized applications. Microsoft continues to invest heavily in simplifying this application modernization journey without sacrificing features and functionality. The differentiators of the Microsoft approach are twofold. First, the applications can be hosted wherever the business requirements dictate – in the public cloud, on-premises, or spanning both. More importantly, there is a single control plane, Azure Resource Manager (ARM), for managing and governing these highly distributed applications.
In this blog series, we share the results of hands-on testing in the Dell Technologies labs with container-related services that span both Public Azure and on-premises Azure Stack Hub. Azure Stack Hub provides a discrete instance of ARM, which allows us to leverage a consistent control plane even in environments with no connectivity to the Internet. Before delving into the hands-on activities in this blog, it may be helpful to review articles from industry experts like Kenny Lowe, Thomas Maurer, and Mary Branscombe that make sense of the myriad announcements about Microsoft's hybrid approach made at Microsoft Ignite 2019.
Services available in Public Azure
Azure Kubernetes Service (AKS) is a fully managed platform service hosted in Public Azure. AKS makes it simple to define, deploy, debug, and upgrade even the most complex Kubernetes applications. With AKS, organizations can accelerate past the effort of deploying and maintaining the clusters to leveraging the clusters as target platforms for their CI/CD pipelines. DevOps professionals only need to concern themselves with the management and maintenance of the K8s agent nodes and leave the management of the master nodes to Public Azure.
AKS is just one of Public Azure’s container-related services. Azure Monitor, Azure Active Directory, and Kubernetes role-based access controls (RBAC) provide the critical governance needed to successfully operate AKS. Serverless Kubernetes using Azure Container Instances (ACI) can add compute capacity without any concern about the underlying infrastructure. In fact, ACI can be used to elastically burst from AKS clusters when workload demand spikes. Azure Container Registry (ACR) delivers a fully managed private registry for storing, securing, and replicating container images and artifacts. This is perfect for organizations that do not want to store container images in publicly available registries like Docker Hub.
Microsoft is working diligently to deliver the fully managed AKS resource provider to Azure Stack Hub. The first step in this journey is to use AKS engine to bootstrap K8s clusters on Azure Stack Hub. AKS engine provides a command-line tool that helps you create, upgrade, scale, and maintain clusters. Customers interested in running production-grade and fully supported self-managed K8s clusters on Azure Stack Hub will want to use AKS engine for deployment and not the Kubernetes Cluster (preview) marketplace gallery item. This marketplace item is only for demonstration and POC purposes.
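As a rough illustration of that bootstrap step, aks-engine deploy is driven by an API model file describing the target cluster. The file name, resource group, and credential values below are placeholders; check the current AKS engine on Azure Stack Hub documentation for the exact flags supported by your version:

# Deploy a self-managed K8s cluster into an Azure Stack Hub user subscription (placeholder values)
aks-engine deploy `
    --azure-env AzureStackCloud `
    --location "region" `
    --resource-group "kube-rg" `
    --api-model .\kubernetes-azurestack.json `
    --client-id $spnClientId `
    --client-secret $spnClientSecret `
    --subscription-id $subscriptionId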
AKS engine can also upgrade and scale the K8s cluster it deployed on Azure Stack Hub. However, unlike the fully managed AKS in Public Azure, the master nodes and the agent nodes need to be maintained by the Azure Stack Hub operator. In other words, this is not a fully managed solution today. The same warning applies to the self-hosted Docker Container Registry that can be deployed to an on-premises Azure Stack Hub via a QuickStart template. Unlike ACR in Public Azure, Azure Stack Hub operators must consider backup and recovery of the images. They would also need to deploy new versions of the QuickStart template as they become available to upgrade the OS or the container registry itself.
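For example, pushing an application image into the self-hosted registry from a build machine follows the standard Docker workflow. The registry FQDN shown is the one from our lab, and the image name and tag are illustrative:

# Authenticate to the self-hosted registry, then tag and push a locally built image
docker login cseregistry.rack04.cloudapp.azs.mhclabs.com
docker tag sentiment-analysis-frontend:latest cseregistry.rack04.cloudapp.azs.mhclabs.com/sentiment-analysis-frontend:latest
docker push cseregistry.rack04.cloudapp.azs.mhclabs.com/sentiment-analysis-frontend:latest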
If no requirements prohibit sending monitoring data to Public Azure and the proper connectivity exists, Azure Monitor for containers can be leveraged for feature-rich monitoring of the K8s clusters deployed on Azure Stack Hub with AKS engine. In addition, Azure Arc for Data Services can be used to run containerized images of Azure SQL Managed Instance or Azure PostgreSQL Hyperscale on this same K8s cluster. Neither option is available in fully disconnected ("submarine") scenarios with no connectivity to Azure whatsoever. In the disconnected scenario, the customer must determine how best to monitor and run data services on their K8s cluster independent of Public Azure.
Here is a summary of the articles in this blog post series:
Part 1: Running containerized applications on Microsoft Azure’s hybrid ecosystem – Provides an overview of the concepts covered in the blog series.
Part 2: Deploy K8s Cluster into an Azure Stack Hub user subscription – Set up an AKS engine client VM, deploy a cluster using AKS engine, and onboard the cluster to Azure Monitor for containers.
Part 3: Deploy a self-hosted Docker Container Registry – Use one of the Azure Stack Hub QuickStart templates to set up a container registry and push images to it. Then, pull these images from the registry into the K8s cluster deployed with AKS engine in Part 2.