Michael Lamia

My mission is to help organizations bridge the gap between the strategic and the tactical in the technology value stream. In my most recent role at Dell Technologies, this translates into creating impactful and educational collateral for Dell APEX Cloud Platform for Microsoft Azure.

LinkedIn: www.linkedin.com/in/michaellamia

containers Microsoft Kubernetes Docker Azure Stack Hub

Running containerized applications on Microsoft Azure's hybrid ecosystem - Introduction

Michael Lamia

Tue, 30 Apr 2024 12:54:21 -0000

Introduction

A vast array of services and tooling has evolved in support of microservices and container-based application development patterns. One indispensable asset in the technology value stream found in most of these patterns is Kubernetes (K8s). Technology professionals like K8s because it has become the de facto standard for container orchestration. Business leaders like it for its potential to help disrupt their chosen marketplace. However, deploying and maintaining a Kubernetes cluster and its complementary technologies can be a daunting task for the uninitiated.

Enter Microsoft Azure’s portfolio of services, tools, and documented guidance for developing and maintaining containerized applications. Microsoft continues to invest heavily in simplifying this application modernization journey without sacrificing features and functionality. The differentiators of the Microsoft approach are twofold. First, the applications can be hosted wherever the business requirements dictate – in the public cloud, on-premises, or spanning both. More importantly, there is a single control plane, Azure Resource Manager (ARM), for managing and governing these highly distributed applications.

In this blog series, we share the results of hands-on testing in the Dell Technologies labs with container-related services that span both Public Azure and on-premises with Azure Stack Hub. Azure Stack Hub provides a discrete instance of ARM, which allows us to leverage a consistent control plane even in environments with no connectivity to the Internet. It might be helpful to review articles rationalizing the myriad of announcements made at Microsoft Ignite 2019 about Microsoft’s hybrid approach from industry experts like Kenny Lowe, Thomas Maurer, and Mary Branscombe before delving into the hands-on activities in this blog.

Services available in Public Azure

Azure Kubernetes Service (AKS) is a fully managed platform service hosted in Public Azure. AKS makes it simple to define, deploy, debug, and upgrade even the most complex Kubernetes applications. With AKS, organizations can accelerate past the effort of deploying and maintaining the clusters to leveraging the clusters as target platforms for their CI/CD pipelines. DevOps professionals only need to concern themselves with the management and maintenance of the K8s agent nodes and leave the management of the master nodes to Public Azure.

AKS is just one of Public Azure’s container-related services. Azure Monitor, Azure Active Directory, and Kubernetes role-based access controls (RBAC) provide the critical governance needed to successfully operate AKS. Serverless Kubernetes using Azure Container Instances (ACI) can add compute capacity without any concern about the underlying infrastructure. In fact, ACI can be used to elastically burst from AKS clusters when workload demand spikes. Azure Container Registry (ACR) delivers a fully managed private registry for storing, securing, and replicating container images and artifacts. This is perfect for organizations that do not want to store container images in publicly available registries like Docker Hub.

Leveraging the hybrid approach

Microsoft is working diligently to deliver the fully managed AKS resource provider to Azure Stack Hub. The first step in this journey is to use AKS engine to bootstrap K8s clusters on Azure Stack Hub. AKS engine provides a command-line tool that helps you create, upgrade, scale, and maintain clusters. Customers interested in running production-grade and fully supported self-managed K8s clusters on Azure Stack Hub will want to use AKS engine for deployment and not the Kubernetes Cluster (preview) marketplace gallery item. This marketplace item is only for demonstration and POC purposes.

AKS engine can also upgrade and scale the K8s cluster it deployed on Azure Stack Hub. However, unlike the fully managed AKS in Public Azure, the master nodes and the agent nodes need to be maintained by the Azure Stack Hub operator. In other words, this is not a fully managed solution today. The same warning applies to the self-hosted Docker Container Registry that can be deployed to an on-premises Azure Stack Hub via a QuickStart template. Unlike ACR in Public Azure, Azure Stack Hub operators must consider backup and recovery of the images. They would also need to deploy new versions of the QuickStart template as they become available to upgrade the OS or the container registry itself.
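As a rough illustration of the command-line workflow described above, the following Python sketch assembles an `aks-engine deploy` invocation that targets Azure Stack Hub. The flag names follow the AKS engine documentation for Azure Stack Hub as we recall them, so please verify them with `aks-engine deploy --help` for your version. The location, resource group, API model file name, and credential values are placeholders and are not taken from this blog series.

```python
import shlex


def build_deploy_command(location, resource_group, api_model,
                         client_id, client_secret, subscription_id):
    """Return the argument list for an `aks-engine deploy` run on Azure Stack Hub."""
    return [
        "aks-engine", "deploy",
        "--azure-env", "AzureStackCloud",        # target Azure Stack Hub, not Public Azure
        "--location", location,
        "--resource-group", resource_group,
        "--api-model", api_model,                # JSON cluster definition
        "--output-directory", resource_group,    # where generated assets are written
        "--client-id", client_id,
        "--client-secret", client_secret,
        "--subscription-id", subscription_id,
    ]


# Placeholder values for illustration only.
command = build_deploy_command(
    location="local",
    resource_group="kube-rg",
    api_model="./kubernetes-azurestack.json",
    client_id="<service-principal-app-id>",
    client_secret="<service-principal-secret>",
    subscription_id="<subscription-id>",
)

# Print the command for review instead of running it.
print(shlex.join(command))
```

To actually run the deployment from the AKS engine client VM, the same list could be passed to `subprocess.run(command, check=True)`, with the real secret read from a secure store rather than hard-coded.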

If no requirements prohibit the sending of monitoring data to Public Azure and the proper connectivity exists, Azure Monitor for containers can be leveraged for feature-rich monitoring of the K8s clusters deployed on Azure Stack Hub with AKS engine. In addition, Azure Arc for Data Services can be leveraged to run containerized images of Azure SQL Managed Instances or Azure PostgreSQL Hyperscale on this same K8s cluster. The Azure Monitor and Azure Arc for Data Services options would not be available in submarine scenarios where there would be no connectivity to Azure whatsoever. In the disconnected scenario, the customer would have to determine how best to monitor and run data services on their K8s cluster independent of Public Azure.

Here is a summary of the articles in this blog post series:

Part 1: Running containerized applications on Microsoft Azure’s hybrid ecosystem – Provides an overview of the concepts covered in the blog series.

Part 2: Deploy K8s Cluster into an Azure Stack Hub user subscription – Set up an AKS engine client VM, deploy a cluster using AKS engine, and onboard the cluster to Azure Monitor for Containers.

Part 3: Deploy a self-hosted Docker Container Registry – Use one of the Azure Stack Hub QuickStart templates to set up a container registry and push images to this registry. Then, pull these images from the registry into the K8s cluster deployed with AKS engine in Part 2.

APEX Dell APEX Cloud Platform for Microsoft Azure ACP ACP for Azure ACP for MS Azure

Dell Technologies First to Deliver Azure Stack HCI 23H2

Michael Lamia

Wed, 24 Apr 2024 15:47:16 -0000

There is nothing quite like being first – first to watch the newly released docuseries on your favorite streaming platform, first to try the highly anticipated new restaurant, first to see the popular band that’s in town, and so on. These types of events tend to get everybody snapping selfies and posting memes on their social media accounts. As a bona fide nerd, I get that same feeling of exhilaration when cool new tech hits the market – especially when it’s from the Dell Technologies and Microsoft team. I love getting the word out about groundbreaking features that produce meaningful business outcomes for our customers.

In September 2023, we officially released our Dell APEX Cloud Platform for Microsoft Azure, the first offer in the market for Premier Solutions for Microsoft Azure Stack HCI. As the first partner to qualify a solution for this elite category, Dell Technologies is ready for greenfield deployments with Azure Stack HCI version 23H2 staged on our factory-delivered MC nodes beginning today. Dell Services is here to provide you with a white glove initial implementation experience.

In this blog, I want to share my enthusiasm about this 23H2 release and help the community understand why it’s such a big deal.

What’s all the fuss about 23H2?

Microsoft announced the general availability of Azure Stack HCI version 23H2 last month. Pundits agree that this may be Microsoft’s most ambitious Azure Stack HCI release to date. The release dramatically simplifies at-scale fleet management of infrastructure distributed across edge locations using Azure Resource Manager (ARM) and key Azure management services. On-premises resources like virtualized desktops, server VMs, and Azure Kubernetes Service (AKS) workload clusters are automatically Azure Arc-enabled. This means that these resources can benefit from Azure’s advanced configuration, monitoring, and security services immediately after the deployment of 23H2.

Topping the list of new features is cloud-based deployment. You can use the Azure portal to deploy Azure Stack HCI from the cloud, including cluster, storage, and networking configuration. You can also leverage ARM templates with custom parameter values for each unique cluster to drive reuse and repeatability. Dell Technologies plans on going beyond the cluster creation aspects of the deployment as we integrate with this new capability in our next release of the APEX Cloud Platform for Azure.
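To make the idea of one reusable ARM template with per-cluster parameter values more concrete, here is a minimal Python sketch that generates a parameter document for each cluster from shared defaults plus overrides. The `$schema` and `contentVersion` fields follow the standard ARM deployment parameter file layout. The parameter names (`clusterName`, `location`, `nodeCount`) and the cluster names are hypothetical and would need to match whatever template you actually deploy.

```python
import json

PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2019-04-01/"
    "deploymentParameters.json#"
)


def build_parameter_file(shared, overrides):
    """Merge shared defaults with one cluster's overrides into an ARM parameter document."""
    merged = {**shared, **overrides}  # per-cluster values win over shared defaults
    return {
        "$schema": PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in merged.items()},
    }


# Hypothetical parameter names and values, for illustration only.
shared_defaults = {"location": "eastus", "nodeCount": 2}
clusters = {
    "store-001": {"clusterName": "store-001"},
    "store-002": {"clusterName": "store-002", "nodeCount": 4},
}

parameter_files = {
    name: build_parameter_file(shared_defaults, overrides)
    for name, overrides in clusters.items()
}

print(json.dumps(parameter_files["store-002"], indent=2))
```

Each generated document could be written to its own file and supplied alongside the single shared template, which is what drives the reuse and repeatability mentioned above.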

As depicted in the following early preview screenshots, we will continue to use our APEX Cloud Platform Foundation Software to provide a fully automated, end-to-end Day 1 deployment and cluster creation experience. This includes bare-metal OS provisioning and onboarding to Azure Arc prior to cluster creation in the Azure portal. We will also be able to seamlessly re-deploy existing clusters using our automation workflow if the need arises.

Figure 1. Early preview of Day 1 deployment and cluster creation workflow

Figure 2. Azure Stack HCI registration step in the early preview

The outcome is the same whether you leverage Dell Technologies’ existing deployment experience or wait for our new cloud-based experience coming in the next release. Both accelerate your time to value – taking you from factory-delivered MC nodes to fully deployed Azure hybrid cloud – using powerful API-driven software capabilities. Dell ProDeploy Services offers a white glove deployment experience that uses our APEX Cloud Platform Foundation Software Day 1 API to rapidly bring up any number of clusters in a predictable and repeatable manner.

During initial deployment of 23H2, Azure Arc Resource Bridge and AKS enabled by Azure Arc are automatically installed on your Azure Stack HCI cluster. This is an especially compelling enhancement, as installing Arc Resource Bridge and AKS on previous Azure Stack HCI versions has been notoriously challenging. Immediately after initial deployment, you can provision Arc-enabled VMs and Arc-enabled Kubernetes workload clusters across any number of on-premises Azure Stack HCI clusters centrally from ARM. You can use a guided, wizard-driven workflow in the Azure portal or ARM templates for Infrastructure as Code (IaC) automation.

Figure 3. Arc Resource Bridge running on three Azure Stack HCI clusters

Figure 4. Azure Arc VM provisioning

Azure Stack HCI version 23H2 also provides management of updates across all your Azure Stack HCI clusters using Azure Update Manager, as shown in the following figure. These updates are applied with the cluster-aware updating feature to prevent any disruption to running workloads. In the context of APEX Cloud Platform for Azure, you will be able to apply monthly quality and security updates using Azure Update Manager. However, baseline updates that include Dell’s BIOS, firmware, and driver packages will still require the full stack lifecycle management automation workflow in the APEX Cloud Platform extension in Windows Admin Center.

Figure 5. APEX Cloud Platform in Azure Update Manager

Azure Virtual Desktop (AVD) may be the most eagerly anticipated Azure service to come to on-premises Azure Stack HCI clusters to date. AVD is now generally available on 23H2 and offers host pool provisioning directly from the Azure portal. After a 23H2 deployment, you can begin creating Windows 10 and Windows 11 single- and multi-session host VMs across all your Azure Stack HCI clusters. These client VMs can also leverage updated Azure Marketplace images with Microsoft 365 applications preinstalled and GPU acceleration for your most demanding client applications.

There is also a bevy of new capabilities and improvements that address the core stack – hypervisor, storage, and VMs:

ReFS deduplication and compression is designed for active workloads like AVD running on Azure Stack HCI and can result in significant storage capacity savings.
Trusted launch comes to Azure Arc-enabled VMs to help prevent firmware and boot loader attacks.
Significant investments have been made to improve the Azure Stack HCI security posture in 23H2. This new version has a tailored security baseline with over 300 security settings configured that remain compliant using a drift control mechanism. Check out the newly published Azure Stack HCI Security Book, which provides a complete readout of all the robust security features that come out-of-the-box with 23H2.
Microsoft Defender for Cloud for Azure Stack HCI (preview) provides coverage for Azure Stack HCI infrastructure as part of the Cloud Security Posture Management plan.
Azure Migrate to Azure Stack HCI (preview) - Use Azure Migrate to move VMs from an existing Hyper-V environment to Azure Stack HCI version 23H2. This feature uses Azure Migrate as the control plane, but the data transfer stays entirely on-premises. Support for VMware vCenter source environments is coming soon.

Dell Technologies is first out of the gate

Premier Solutions for Microsoft Azure Stack HCI is reserved for top partners with the deepest levels of integration and engineering collaboration with Microsoft. Dell Technologies completed our testing and validation of 23H2 ahead of general availability on the Dell APEX Cloud Platform for Microsoft Azure. We are pleased to get the powerful capabilities of 23H2 into your hands immediately, so you can spend less time on operations and more time on the innovation that helps your organization secure a competitive advantage in the market. Dell ProDeploy Services is ready to provide a white glove 23H2 deployment experience on all new MC nodes delivered from the factory.

Figure 6. 23H2 release timeline

With all the new cloud-based capabilities Microsoft has introduced for Day 1 – N operations with 23H2, we want to be clear about how IT administrators will perform various tasks specifically with Dell APEX Cloud Platform for Microsoft Azure. Some administrative scenarios can be accomplished at scale with Azure Resource Manager, and others will require the granular, cluster-by-cluster access provided by the Dell APEX Cloud Platform Foundation Software. This software integrates automation workflows into Windows Admin Center via a Dell extension.

Figure 7. APEX Cloud Platform’s consistent management experience

The following table provides a detailed comparison of management capabilities.

Table 1. Comparison of fleet and cluster-level management capabilities

Task	Fleet Management with ARM	Cluster-Level Management with APEX Cloud Platform Foundation Software
Day 1 deployment	Cloud-based deployment from Azure will be integrated with our solution later in 2024.	Day 1 deployment and cluster creation automation currently performed by Dell ProDeploy Services.
Monitoring	Event Monitoring for Dell APEX Cloud Platform for Microsoft Azure feature in Azure Monitor Insights for Azure Stack HCI. This includes a Dell workbook for visualizing real-time hardware and software alerts.	Physical View of platform component inventory, monitoring, and alerting on a per cluster basis.
Lifecycle management	Azure Update Manager available for Azure Stack HCI monthly quality and security updates on APEX Cloud Platform for Azure. Baseline updates, including hardware, require APEX Cloud Platform Foundation Software.	Full stack lifecycle management keeps an individual cluster in a continuously validated state, progressing from one known good state to the next inclusive of OS, hardware, and systems management software.
Security management	Fully integrated with Microsoft Defender for Cloud (preview).	Toggle intrinsic infrastructure security management features, including Infrastructure Lock and secured core server.
Scale out and scale up	Not currently in scope.	Add Node and Add Disk features fully automate node and cluster expansion.
Node management	Not currently in scope.	Workflow available to repair and replace cluster nodes.
Serviceability and support	Not currently in scope.	Enables phone home, auto case creation, and remote connectivity to create a consolidated management, operations, and support experience.

Full stack lifecycle management

In the future, you will be able to leverage our full stack lifecycle management (LCM) experience in the Dell APEX Cloud Platform Foundation Software for in-place OS upgrades. Our software periodically queries Dell Technologies and Microsoft update sites, checking for new bundles. You never have to leave the Updates tab of the APEX Cloud Platform extension in Windows Admin Center, as shown in the following figure, to review or apply updates. The software identifies any patch dependencies that may exist before revealing these bundles in the Updates tab. Guardrails are established to ensure you apply all updates in the proper order. Dell Technologies and Microsoft collaboratively test and validate this process for every release using APEX Cloud Platform hardware in our respective engineering labs.

Figure 8. Update bundles in the APEX Cloud Platform extension in Windows Admin Center

The following table summarizes the contents of each update bundle type.

Table 2. Contents of update bundles

Update Bundle	Contents
Azure Stack HCI Solution (baseline)	Azure Arc infrastructure, Lifecycle Manager, and Operating System
APEX Cloud Platform Foundation Software	Cloud Platform Manager VM and all microservices-based systems management automation and orchestration software
APEX Cloud Platform Hardware	BIOS, iDRAC, firmware and driver updates
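The ordering guardrails described in this section can be thought of as a dependency sort over the available bundles. The sketch below uses Python's standard `graphlib` module to show the idea. Note that the dependency edges are invented purely for illustration. This blog does not state which bundle must precede which, and the sketch does not represent how the Foundation Software is actually implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each bundle lists the bundles that must be
# applied before it. These edges are assumptions, not documented behavior.
dependencies = {
    "APEX Cloud Platform Foundation Software": set(),
    "Azure Stack HCI Solution (baseline)": {
        "APEX Cloud Platform Foundation Software",
    },
    "APEX Cloud Platform Hardware": {
        "Azure Stack HCI Solution (baseline)",
    },
}

# static_order() yields every bundle only after all of its prerequisites.
apply_order = list(TopologicalSorter(dependencies).static_order())

for step, bundle in enumerate(apply_order, start=1):
    print(f"{step}. {bundle}")
```

If the map ever contained a circular dependency, `TopologicalSorter` would raise `graphlib.CycleError`, which is the kind of condition a precheck would want to surface before any update starts.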

For a more in-depth discussion about this full stack lifecycle management feature, please review this recent blog post, The Evolution of Azure Stack HCI Lifecycle Management.

23H2 is only the beginning

Support of Azure Stack HCI version 23H2 is only one of the many enhancements we’ve introduced in our latest release of Dell APEX Cloud Platform for Microsoft Azure. We’ve also added new automation workflows to our extension in Windows Admin Center, which include many pre-checks and validations to ensure consistently successful operations with no disruption to running workloads:

Add and Replace Disks feature: We’ve provided an automated workflow to increase Storage Spaces Direct capacity and performance or restore capacity and performance to a desired state.
Node Repair and Replace feature: We’ve also made it easy to restore a cluster to full health after a node has experienced a failure that requires the server to be reimaged.

Dell Technologies is also developing integrations into Azure management and governance services. This latest platform release introduces the first of these integrations. You can now visualize fault and informational event data generated by the MC node hardware and the Cloud Platform Manager VM using an Azure Monitor Insights for Azure Stack HCI workbook. Simply enable the Event Monitoring for Dell APEX Cloud Platform for Microsoft Azure feature for Insights to get started.

Resources

We have tons of great content to help you deep-dive into Dell APEX Cloud Platform for Microsoft Azure powered by Dell APEX Cloud Platform Foundation Software:

What's New with the Dell APEX Cloud Platform for Microsoft Azure March 2024 Release
Monitoring the Dell APEX Cloud Platform for Microsoft Azure with Azure Insights
YouTube playlist with educational and demo videos
NEW YouTube playlist for March 2024 release
Info Hub white papers, videos, and interactive demos
APEX Cloud Platform for Azure main product page
Microsoft’s official announcement of 23H2 general availability
General availability of Azure Virtual Desktop
Azure Stack HCI Security Book
Check out What’s new for Azure edge infrastructure in 2023 for an eye-opening case study of a fictional grocery store chain that uses Microsoft Azure to deploy and manage infrastructure at the edge using Azure Arc, Azure Kubernetes Service, and Azure Stack HCI. This is a highly enlightening, end-to-end view of how all the technologies within the Azure hybrid cloud ecosystem can harmoniously work together to solve a real-world business problem in the retail vertical.

And as always, please reach out to your Dell Technologies account team if you would like to have more in-depth discussions about the Dell APEX Cloud Platforms family. If you don’t currently have a Dell Technologies contact, we’re here to help on our corporate website.

Author: Michael Lamia, Engineering Technologist at Dell Technologies

Follow me on X: @Evolving_Techie

LinkedIn: https://www.linkedin.com/in/michaellamia/

Windows Admin Center APEX Cloud Platforms Lifecycle management updates

The Evolution of Azure Stack HCI Lifecycle Management

Michael Lamia

Wed, 24 Apr 2024 15:39:15 -0000

Today, Dell Technologies announced the general availability of Dell APEX Cloud Platform for Microsoft Azure. This on-premises, turnkey infrastructure platform is collaboratively engineered with Microsoft to optimize the Azure hybrid cloud experience.

It is the first offer in Premier Solutions for Microsoft Azure Stack HCI, a new category in the Azure Stack HCI catalog reserved for key partners with the greatest levels of engagement with Microsoft and deepest integrations into familiar Microsoft management tools.

The secret sauce

Dell APEX Cloud Platform for Microsoft Azure comes bundled with fully automated management and orchestration, delivered by Dell APEX Cloud Platform Foundation Software. This software runs in a virtual appliance on each cluster and functions as the brains of the solution stack. The Cloud Platform Manager VM communicates with the underlying infrastructure and injects automation workflows into Microsoft Windows Admin Center via the Dell APEX Cloud Platform extension, as depicted in the following diagram.

Features that deliver breakthrough operational efficiency from Day 1 through Day 2/N include:

Deployment and cluster creation automation – Fastest path to Azure hybrid cloud providing an 88% reduction in steps versus a manual deployment approach.
Physical hardware views – Intuitive user interface for rapid identification of MC node components and cluster health.
Integrations with Dell ProSupport – Accelerates time to issue resolution with log collection, remote support, and phone home capabilities.
Intrinsic infrastructure security management – Toggle Dell Infrastructure Lock to prevent unauthorized changes to configuration settings and to block updates to the platform. Secured-core server establishes a hardware root of trust and provides firmware protection and virtualization-based security.
End-to-end cluster expansion – Scale out a cluster in a highly efficient and fully automated manner using a guided wizard-driven workflow.

In this blog, we will focus on one of the most compelling and highly anticipated features of Dell APEX Cloud Platform Foundation Software – next generation full stack lifecycle management (LCM).

Our latest approach to LCM keeps Dell APEX Cloud Platform for Microsoft Azure operating in a Continuously Validated State (CVS) – advancing from one Known Good State (KGS) to the next inclusive of hardware, operating system, and systems management software. We have dramatically accelerated time to value with this approach, making new Microsoft updates available within just four hours of their release.

The following graphic depicts the journey of an update from development to installation.

History lesson

Dell Technologies is no stranger to efficiently applying updates to Azure Stack HCI clusters, having done so using a fully automated, cluster-aware approach with no impact to running workloads since 2019.

We first introduced this automation in our Dell OpenManage Integration with Microsoft Windows Admin Center v1.1. At that time, we provided the ability to generate a compliance report within our standalone extension that compared the currently running BIOS, firmware, and driver versions with an engineering-validated solution baseline catalog. Simply choose between targeting an online catalog or creating an offline catalog using Dell Repository Manager, and then our standalone extension would orchestrate the updates using Cluster-Aware Updating.

Version 2.0 of our OpenManage Integration extension went a step further to deliver our first foray into full stack cluster-aware updating through a snap-in developed for Microsoft’s Updates extension.

Using this snap-in, Azure Stack HCI operating system updates and Dell hardware updates (i.e., BIOS, firmware, and drivers) were applied using a single, consolidated workflow. This workflow only required one reboot per cluster node and was completely non-disruptive to running workloads. Once again, IT administrators could view a compliance report and select an online or DRM-created offline catalog for the Dell updates.
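Conceptually, the compliance report described above is a comparison between what is running on a node and what the validated baseline catalog expects. The following Python sketch shows that comparison in its simplest form. The component names and version strings are made up for illustration, and the real catalog format used by the OpenManage Integration extension is not shown in this blog.

```python
def compliance_report(installed, baseline):
    """Compare installed component versions against a baseline catalog."""
    report = {}
    for component, expected in baseline.items():
        actual = installed.get(component)  # None if the component is missing
        report[component] = {
            "installed": actual,
            "baseline": expected,
            "compliant": actual == expected,
        }
    return report


# Hypothetical component names and versions, for illustration only.
baseline_catalog = {"BIOS": "2.10.0", "NIC firmware": "22.5.7", "Storage driver": "7.1.3"}
node_inventory = {"BIOS": "2.10.0", "NIC firmware": "22.0.1"}

report = compliance_report(node_inventory, baseline_catalog)
non_compliant = sorted(name for name, row in report.items() if not row["compliant"])

print("Components needing an update:", non_compliant)
```

Only the non-compliant components would then be handed to the update orchestration, which in the workflow described above is Cluster-Aware Updating.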

Maintaining a Continuously Validated State

We’ve developed an entirely new Windows Admin Center extension with integrated Dell APEX Cloud Platform Foundation Software workflow automation. We continue to build on the pedigree we’ve established over the last four years with our OpenManage Integration extension, improving further by now incorporating proven and market-leading intellectual property (IP) from our other hyperconverged infrastructure (HCI) and software defined storage (SDS) offerings. Some of this innovative IP is derived from our highly successful VxRail HCI System software and results in a new standard for lifecycle management in a turnkey infrastructure platform.

When freshly deployed, Dell APEX Cloud Platform for Microsoft Azure runs at peak performance and resiliency to support your current workloads. The platform also comes secure by default with the following protection:

BIOS and operating system settings are configured correctly to enable secured-core server. Secured-core server establishes a hardware-backed root of trust, provides defense against firmware-level attacks, and enables virtualization-based security.
Data-at-rest encryption is enabled on all volumes using BitLocker.
Microsoft Defender Antivirus is built into Azure Stack HCI and provides real-time always-on antivirus protection with automatic definition updates.
Azure Stack HCI has more than 200 security settings enabled out-of-the-box. These settings provide a consistent security baseline. For example, security posture is improved by disabling legacy protocols and ciphers.
Windows Defender Application Control (WDAC) is a software-based security layer that reduces the attack surface by enforcing an explicit list of software that is allowed to run. Dell APEX Cloud Platform for Microsoft Azure comes with WDAC enabled and enforced by default.

This pristine operating environment is known as the platform’s current Known Good State (KGS). Rest assured that the entire platform is running in a condition that is collaboratively validated by Dell and Microsoft engineering. To maintain the robust default security posture and optimal performance and resiliency, you need to keep the platform in a Continuously Validated State (CVS). Comprehensively advancing the end-to-end platform from one KGS to the next is accomplished with zero interruption to running workloads. The following graphic shows an example of a quarterly update that includes multiple software and hardware update components.

(Note: The release versions in this graphic are examples only and do not align with any official Dell APEX Cloud Platform for Microsoft Azure planned releases.)

Release terminology

The following table summarizes the different platform components that must be routinely updated to be compliant with the current or target KGS.

Component	Provider	Description	Example versioning
Azure Stack HCI Solution	Microsoft	This contains OS quality and security updates, feature updates, emergency patches, and the Azure Stack HCI supplemental package	10.2306.1.11
Dell APEX Cloud Platform Foundation Software	Dell Technologies	All software and services running inside the Cloud Platform Manager virtual machine	01.00.00.00
Solution Builder Extension (SBE)	Dell Technologies	Package that contains all hardware updates for BIOS, iDRAC, firmware and drivers	4.0.2307.1

The Azure Stack HCI Solution component follows the Modern Lifecycle policy, which defines the products and services that are continuously serviced and supported. To keep your Azure Stack HCI service in a supported state, you have up to six months to install updates. Dell and Microsoft recommend installing all updates as they are released to capitalize on the rapid pace of innovation and inclusion of new features. To learn more, see Azure Stack HCI release information.
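As a quick worked example of the six-month window, the Python sketch below computes the last date by which an update would need to be installed. It assumes the window is counted in calendar months from the release date, which is our simplification. Please consult the Azure Stack HCI release information for the authoritative policy. The release date used is arbitrary.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`, clamping the day."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]  # days in the target month
    return date(year, month, min(start.day, last_day))


def install_deadline(release_date, window_months=6):
    """Assumed deadline: the release date plus the support window."""
    return add_months(release_date, window_months)


def is_within_window(release_date, today):
    """True while the update can still be installed inside the window."""
    return today <= install_deadline(release_date)


example_release = date(2023, 8, 31)  # arbitrary example date
print("Install by:", install_deadline(example_release).isoformat())
```

The day is clamped so that a release on the 31st maps to the last day of a shorter month rather than raising an error.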

Dell and Microsoft release the following types of updates for this platform:

Update type	Description	Typical cadence
Baseline updates	Baseline updates include new features and improvements. They typically require host system reboots and might take longer.	Quarterly
Patch Updates	Patch updates primarily contain quality and reliability improvements. They might include OS LCUs or hot patches. Some patches require host system reboots, while others don't. To fix critical or security issues, patches might be released sooner than monthly.	Monthly
Hotfix	Hotfixes address blocking issues that could prevent regular patch or baseline updates.	On-demand

Microsoft Azure and Dell update sites are periodically queried to discover applicable updates. These updates are listed in the Updates tab in the Dell APEX Cloud Platform extension in Windows Admin Center.

All updates – even emergency patches from Microsoft that address critical security vulnerabilities or bug fixes – will appear in the Dell extension within just four hours of being released. This near immediate availability of patches is unprecedented in a turnkey, on-premises infrastructure platform. And whether the updates are from Microsoft, Dell, or both organizations, you’ll never need to navigate away from the Dell APEX Cloud Platform extension interface to apply them.

Engineering rigor produces stress-free LCM

In the past, Dell validated hardware updates and Microsoft validated operating system updates independently. With our enhanced lifecycle management approach, every update discovered by Dell APEX Cloud Platform Foundation Software has been jointly tested and validated by Dell and Microsoft. We incorporate new periodic builds of hardware, OS, and systems management components into our respective validation CI/CD pipelines. This raises the bar to an entirely new level of confidence and peace-of-mind for IT administrators.

In the relentless pursuit of delivering worry-free updates, the full stack lifecycle management workflow performs extensive prechecks before any update operations are initiated. For example, all platform components are checked to ensure they comply with the current KGS. If Dell Infrastructure Lock is enabled on the platform, a dialog box informs you that it will be temporarily disabled to allow updates and re-enabled after the update workflow is complete to maintain a strong security posture.
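The temporary disabling and re-enabling of Infrastructure Lock around an update maps naturally onto a context manager. Here is a minimal Python sketch of that pattern. The `Platform` class and its `lock_enabled` attribute are hypothetical stand-ins, and restoring the lock even when the update fails is our assumption about sensible behavior rather than something this blog confirms.

```python
from contextlib import contextmanager


class Platform:
    """Hypothetical stand-in for a cluster with an infrastructure lock."""

    def __init__(self, lock_enabled):
        self.lock_enabled = lock_enabled
        self.log = []


@contextmanager
def lock_suspended(platform):
    """Disable the lock for the duration of the block, then restore it."""
    was_enabled = platform.lock_enabled
    if was_enabled:
        platform.lock_enabled = False
        platform.log.append("lock disabled")
    try:
        yield platform
    finally:
        # Runs on success and on failure, so the lock is always restored.
        if was_enabled:
            platform.lock_enabled = True
            platform.log.append("lock re-enabled")


cluster = Platform(lock_enabled=True)
with lock_suspended(cluster):
    cluster.log.append("updates applied")

print(cluster.log)
```

A platform that started with the lock turned off is left untouched, so the workflow never enables a lock the administrator did not ask for.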

The entire update process relies heavily on Azure Stack HCI’s Lifecycle Manager feature, which employs Cluster-Aware Updating (CAU) to ensure no workloads are interrupted. One cluster node is placed into maintenance mode at a time, which triggers the Live Migration of VMs. CAU installs the updates, restarts the node if required, returns the node to a fully functional state, and proceeds to the next node in the cluster. When the LCM workflow is complete, a new compliance check is triggered to confirm that the platform has fully transitioned to the new target KGS.
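The node-by-node sequence above can be simulated in a few lines of Python. This is a toy model of a rolling update and not the Cluster-Aware Updating implementation. The node and VM names are invented, VM placement simply picks the least-loaded remaining node, and VMs are not moved back afterward.

```python
def rolling_update(nodes, vms_by_node, apply_update):
    """Update nodes one at a time, moving VMs off each node first."""
    if len(nodes) < 2:
        raise ValueError("a rolling update needs at least two nodes")
    placement = {node: list(vms_by_node.get(node, [])) for node in nodes}
    events = []
    for node in nodes:
        others = [n for n in nodes if n != node]
        # Maintenance mode: migrate every VM to the least-loaded other node.
        for vm in placement[node]:
            target = min(others, key=lambda n: len(placement[n]))
            placement[target].append(vm)
            events.append(("migrate", vm, node, target))
        placement[node] = []
        apply_update(node)                # install updates, restart if required
        events.append(("update", node))
        events.append(("resume", node))   # node rejoins before the next one starts
    return placement, events


# Invented node and VM names, for illustration only.
updated_nodes = []
final_placement, event_log = rolling_update(
    nodes=["node1", "node2", "node3"],
    vms_by_node={"node1": ["vm-a", "vm-b"], "node2": ["vm-c"], "node3": []},
    apply_update=updated_nodes.append,
)

for event in event_log:
    print(event)
```

Because only one node is drained at any moment, every VM always has a running host, which is the property that keeps workloads uninterrupted.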

Seeing is believing

The best way to summarize all the incredible benefits I’ve discussed about our evolved LCM approach is with a demo. Experience for yourself how stress-free LCM can be in this short video vignette.

Resources

We have tons of great content to help you deep-dive into Dell APEX Cloud Platform for Microsoft Azure powered by Dell APEX Cloud Platform Foundation Software.

InfoHub (White Papers, Blogs, Interactive Journey, and more) – https://infohub.delltechnologies.com/t/cloud-platforms/
YouTube playlist with educational and demo videos – https://www.youtube.com/playlist?list=PL2nlzNk2-VMEkNM7E8m0ia_lLHWlOuT5h
Main product page with spec sheets, solution briefs, infographics, and other great collateral – https://www.dell.com/azure
Dell Support site with administrator guides – https://www.dell.com/support/home/en/product-support/product/apex-cloud-pf-ms-azure/docs

And as always, please reach out to your Dell account team if you would like to have more in-depth discussions about the Dell APEX Cloud Platforms family. If you don’t currently have a Dell contact, we’re here to help on our corporate website.

Author: Michael Lamia, Engineering Technologist at Dell Technologies

Follow me on Twitter: @Evolving_Techie

LinkedIn: https://www.linkedin.com/in/michaellamia/

Email: michael.lamia@dell.com

APEX automation APEX Cloud Platform Foundation Microsoft Azure

Preview of Intelligent Automation in Dell APEX Cloud Platform for Microsoft Azure

Michael Lamia

Wed, 24 Apr 2024 15:35:21 -0000

UPDATE 11/7/2023: This blog and the embedded YouTube videos were published after Dell APEX Cloud Platform for Microsoft Azure was first announced at Dell Technologies World 2023. It contains early preview content.

Please proceed to the following links to see the most up-to-date collateral and YouTube demo videos, which were created after the platform became generally available in September 2023.

https://www.youtube.com/playlist?list=PL2nlzNk2-VMEkNM7E8m0ia_lLHWlOuT5h

https://infohub.delltechnologies.com/t/cloud-platforms/

It was another exhilarating Dell Technologies World (DTW) back in May. It’s always fun connecting with colleagues, customers, and partners in Las Vegas. As always, Vegas managed to surprise me with something I’d never seen before. I finally witnessed the incredible iLuminate team up close and personal at the APEX After Dark party. I tried to describe the phenomenon to a friend who hasn’t experienced one of their performances, but words cannot adequately convey this mesmerizing spectacle of sight and sound! In the end, only one of my photos from the event and a link to one of their recorded shows could make it real for them.

Similarly, words alone can’t do justice to the game-changing potential of the new APEX Cloud Platform announced at DTW. That’s why I created a demo video giving customers an early preview[1] of the new management and orchestration capabilities coming to our APEX Cloud Platform Foundation Software. This software integrates intelligent automation into the familiar management tools of each supported cloud ecosystem – Microsoft Azure, Red Hat OpenShift, and VMware vSphere.

In this blog, I want to showcase APEX Cloud Platform for Microsoft Azure and the features and functionality we integrate into Microsoft Windows Admin Center. My colleague and friend, Kenny Lowe, wrote a brilliant analysis of our new solution in his recent blog post, Delving Into the APEX Cloud Platform for Microsoft Azure. He included some screen shots from my demo video, which hasn’t been shared publicly until now. I highly recommend reading his enlightening article, which provides invaluable context before viewing the demos.

Please be aware that the clips below are sections of a lengthier video that shares the story of a fictional retail company named WhyGoBuy. They used APEX Cloud Platform Foundation Software to accelerate their time to value and improve operational efficiency. Because this video was over 15 minutes long, I divided it into bite-sized chunks and included a brief introduction to each administrative task. You can view the full video HERE.

Seeing is believing

Without further ado, let’s dive into the technology!

At initial release of APEX Cloud Platform for Microsoft Azure, Dell Technologies is offering a white-glove deployment experience through Dell ProDeploy Services. Our expert technicians will walk you through your first deployments to help you get comfortable with the process. Soon after announcing general availability, we will empower you to install the platform yourself using the APEX Cloud Platform Foundation Software deployment automation. In this first video, our administrators at WhyGoBuy followed the step-by-step user input configuration method and provided the settings in each step of the deployment wizard.

The next video presents a common Day 2 operations scenario. Some of WhyGoBuy’s Storage Spaces Direct volumes were approaching maximum capacity, and one volume required immediate attention. Luckily, APEX Cloud Platform for Microsoft Azure offered a consistent hybrid management experience. Administrators were promptly made aware of the issue through Azure Monitor, which provided observability for their entire fleet of platforms across data center and edge locations. Then, they navigated to the Windows Admin Center extension for further investigation and remediation of the issue.

Lifecycle management is critical to ensuring the optimal security, performance, and reliability of any infrastructure. With APEX Cloud Platform Foundation Software, Dell helps our customers remain in a continuously validated state – updating the platform from one known good state to the next, inclusive of hardware, operating system, and systems management software. A few months passed since WhyGoBuy deployed their first platform, and the time came to apply a quarterly baseline bundle using the Windows Admin Center extension. The following video captures their experience.

WhyGoBuy was committed to maintaining a robust security posture. They used the intrinsic infrastructure security management features in APEX Cloud Platform Foundation Software to help them accomplish this. The next video showcases two of these features:

Infrastructure Lock – Protects against unauthorized or malicious changes to configuration settings by enabling the System Lockdown feature in Dell iDRAC. This also prevents updates to BIOS, firmware, and drivers to guard against cybersecurity attacks.
Secured-core server – Proactively defends against many of the paths attackers might use to exploit a system by establishing a hardware root-of-trust, protecting firmware, and introducing virtualization-based security.

In this final video, WhyGoBuy set up connectivity to Dell ProSupport to benefit from log collection, phone home, automated case creation, and remote support. They also wanted to send telemetry data to Dell CloudIQ cloud-based software for multi-cluster monitoring. CloudIQ provided proactive monitoring, machine learning, and predictive analytics so they could take quick action and simplify operations of all their on-premises APEX Cloud Platforms.

The future’s so bright

We are excited to bring Dell APEX Cloud Platform for Microsoft Azure to market later this year. I’ve compiled the following list of available resources for further learning.

After we launch this solution, you’ll be able to find white papers, videos, blogs, and more at the APEX tile at our Info Hub site.

And as always, please reach out to your Dell account team if you would like to have more in-depth discussions about the APEX portfolio. If you don’t currently have a Dell contact, we’re here to help on our corporate website.

Author: Michael Lamia, Engineering Technologist at Dell Technologies

Follow me on Twitter: @Evolving_Techie

LinkedIn: https://www.linkedin.com/in/michaellamia/

Email: michael.lamia@dell.com

[1] Dell APEX Cloud Platform for Microsoft Azure will be generally available later in 2023. Some of the features and functionality depicted in these videos may behave differently at initial release or may not be available until later releases. Dell makes no representation and undertakes no obligations with regard to product planning information, anticipated product characteristics, performance specifications, or anticipated release dates (collectively, “Roadmap Information”). Roadmap Information is provided by Dell as an accommodation to the recipient solely for the purposes of discussion and without intending to be bound thereby.

Azure Stack HCI Microsoft OpenManage lab demos HOL

Test Drive Azure Stack HCI in the Dell Demo Center!

Michael Lamia

Wed, 24 Apr 2024 15:29:51 -0000

A picture is worth a thousand words, but the value of a good hands-on lab is immeasurable!

Our newly minted interactive demo and hands-on lab are published in the Dell Technologies Demo Center:

The interactive demo (ITD-0910) offers an immersive look at all Azure Stack HCI management and governance features in Dell OpenManage Integration with Microsoft Windows Admin Center.
If you're seeking a deep-dive into the Dell Integrated System for Microsoft Azure Stack HCI initial deployment experience, you may prefer the PowerShell-heavy hands-on lab (HOL-0313-01).

In this blog, we'll begin with a brief introduction to these test drives. Then, we'll share our list of other virtual labs that will prove invaluable on your journey to becoming an Azure Stack HCI champion. Fasten your seatbelt and get ready to take your skills to the next level!

The interactive demo can be accessed directly by all customers and partners. When first navigating to the Demo Center site, remember to click the Sign In drop-down menu in the upper right corner of the page.

At the present time, you will not see the hands-on lab appear in the Demo Center catalog. You will need to contact your Dell Technologies account team to gain access to HOL-0313-01.

Interactive Demo ITD-0910

Taking this demo is like competing in a Formula 1 or NASCAR race. It is fast-paced and remains within the secure confines of the track's guardrails. Each module in the demo guides you down a well-defined path that leads to a desired business outcome. Here is a summary of the benefits our OpenManage Integration extension delivers:

Uses automation to streamline operational efficiency and flexibility
Provides a consistent hybrid management experience by using a single Dell HCI Configuration Profile policy definition
Reduces operational expense by intelligently right-sizing infrastructure to match your workload profile
Ensures stability and security of infrastructure with real-time monitoring and lifecycle management
Protects your IT estate from costly changes to configuration settings made inadvertently or maliciously

Whenever new features are released for our extension, you'll be able to familiarize yourself with them here first. In the latest release (v3.0), we completely revamped the user interface for improved usability and navigation. We also added server and cluster-level checks to ensure that all prerequisites are met for seamless enablement of management and monitoring operations. The following figure illustrates the results of a prerequisite check. In the interactive demo, you learn more about these failures and how to use the OpenManage Integration extension to fix them.

When we first start driving, we need our parents and teachers to provide turn-by-turn directions. If you're exploring the extension for the first time, you'll want to keep the guides enabled to aid your understanding.

For example, consider the CPU Core Management feature. This feature allows you to right-size your Azure Stack HCI cluster by enabling or disabling CPU cores to meet the requirements of your workload profile. It can also help save on subscription costs because Azure Stack HCI hosts are billed by CPU core per month. The guide in the following figure reminds you that a thorough analysis of your workload characteristics is essential prior to reducing the enabled CPU cores on this cluster.
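A quick back-of-the-envelope calculation shows why per-core billing makes right-sizing worthwhile. The Python sketch below uses made-up numbers: the node count, the core counts, and especially the per-core rate are placeholders, so substitute your actual subscription pricing.

```python
def monthly_core_cost(nodes: int, enabled_cores_per_node: int, rate_per_core: float) -> float:
    """Monthly subscription cost when hosts are billed per enabled CPU core."""
    if nodes < 0 or enabled_cores_per_node < 0 or rate_per_core < 0:
        raise ValueError("inputs must be non-negative")
    return nodes * enabled_cores_per_node * rate_per_core


RATE = 10.0  # hypothetical price per core per month, not a quoted rate

before = monthly_core_cost(nodes=4, enabled_cores_per_node=32, rate_per_core=RATE)
after = monthly_core_cost(nodes=4, enabled_cores_per_node=24, rate_per_core=RATE)
saving = before - after

if __name__ == "__main__":
    print(f"before: {before:.2f}, after: {after:.2f}, monthly saving: {saving:.2f}")
```

The saving scales linearly with the number of cores you disable, which is exactly why the workload analysis has to come first: cores you turn off are cores your VMs can no longer use.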

After you've familiarized yourself with the talk track, you can leave your parents and teachers at home and drive through the demos without the detailed explanations. You can navigate using links alone by clicking the X in the upper right-hand corner of any guide. You might choose to proceed down this road to test your knowledge. As a Dell Technologies partner, you might want to create the illusion of performing a demo from a live environment to impress prospective clients.

Hands-on Lab HOL-0313-01

The Microsoft Azure Stack HCI Deployment hands-on lab in the Demo Center will appeal to the more mechanically inclined. It pops open the hood so you can get your hands dirty with all the PowerShell automation in our End-to-End Deployment Guide for Day 1 deployments. It is accompanied by an in-depth student manual to point you in the right direction, but there is a bit more freedom to go off-road compared with the interactive demo. Keep in mind that this is a virtual environment, so certain tasks that require the physical hardware may be limited.

This figure illustrates how you can drag and drop the PowerShell code into the console, so you aren't wasting time typing everything yourself:

We still show the GUI some love in the later portions of the lab. Failover Cluster Manager and Windows Admin Center make an appearance after you've used PowerShell to configure the hosts, create the cluster, configure a cluster witness, and enable Storage Spaces Direct (S2D). You'll be able to use the graphical tools to confirm that the command-line steps produced the expected outcome.

The following figure shows the step where you use Failover Cluster Manager to inspect the newly created storage pool after it's created with PowerShell.

You'll also explore some of the management and monitoring capabilities in Windows Admin Center after adding your new cluster as a connection. This section of the HOL stops short of exploring the OpenManage Integration extension, though. We provide a link in the student manual to the interactive demo. If you’re not a fan of the layout of the lab shown in the following figure, you can rearrange the panes to fit your preferences. For example, you can open the manual in a separate window and allow your virtual desktop to consume all your screen real estate.

Other Opportunities to Get Hands-On

Maybe the interactive demo and hands-on lab don't meet your needs. Maybe you're looking to kick the tires on Azure Stack HCI without any training wheels. In that case, there are other options available to you. We have compiled a great list of resources that address a variety of use cases:

MSLab – Using this GitHub project, you can run entire virtual Azure Stack HCI environments directly on your laptop if it meets moderate hardware requirements. There are endless Azure hybrid scenarios available to try (Azure Kubernetes Service hybrid, Azure Virtual Desktop, Azure Arc portfolio, and so on), and new ones are added almost immediately after new features are released.
Dell GEOS Azure Stack HCI Hands-on Lab Guides – The Dell Global Engineering Outreach Specialists have crafted extensive guides to accompany the MSLab scenarios.
Dell Technologies & Microsoft | Hybrid Jumpstart – The goal of this jumpstart is to help you grow your knowledge, skills, and experience around several core hybrid cloud solutions from the Dell Technologies and Microsoft hybrid portfolio. This has many highly prescriptive hands-on modules and resembles more of a Pluralsight or Microsoft Learn course.
Azure Arc Jumpstart – If you want to skip the infrastructure deployment steps and get right into the key features of the Azure Arc portfolio, then this project is for you. All you need is an Azure subscription and a single resource group to get started immediately.
Dell Technologies Customer Solution Center – Speak with your Dell Technologies account team to schedule a personalized engagement with a Customer Solution Center. Our dedicated subject matter experts can help you with extensive Proofs of Concept with your target workloads.

If you're looking for educational materials on Azure Stack HCI, like white papers, blogs, and videos, visit our Info Hub and main product page.

Be sure to follow me on Twitter @Evolving_Techie and LinkedIn.

Author: Mike Lamia

containers Microsoft Kubernetes Azure Stack Hub

Deploy K8s clusters into Azure Stack Hub user subscriptions

Michael Lamia

Mon, 24 Jul 2023 15:06:01 -0000

Deploy K8s Cluster into an Azure Stack Hub user subscription

Welcome to Part 2 of a three-part blog series on running containerized applications on Microsoft Azure’s hybrid ecosystem. Part 1 provided the conceptual foundation and necessary context for the hands-on efforts documented in parts 2 and 3. This article details the testing we performed with AKS engine and Azure Monitor for containers in the Dell Technologies labs.

Here are the links to all the series articles:

Part 1: Running containerized applications on Microsoft Azure’s hybrid ecosystem – Provides an overview of the concepts covered in the blog series.

Part 2: Deploy K8s Cluster into an Azure Stack Hub user subscription – Set up an AKS engine client VM, deploy a cluster using AKS engine, and onboard the cluster to Azure Monitor for containers.

Introduction to the test lab

Before proceeding with the results of the testing with AKS engine and Azure Monitor for containers, we’d like to begin by providing a tour of the lab we used for testing all the container-related capabilities on Azure Stack Hub. Please refer to the following two tables. The first table lists the Dell EMC for Microsoft Azure Stack hardware and software. The second table details the various resource groups we created to logically organize the components in the architecture. The resource groups were provisioned into a single user subscription.

Scale Unit Information	Value
Number of scale unit nodes	4
Total memory capacity	1 TB
Total storage capacity	80 TB
Logical cores	224
Azure Stack Hub version	1.1908.8.41
Connectivity mode	Connected to Azure
Identity store	Azure Active Directory

Resource Group	Purpose
demoaksengine-rg	Contains the resources associated with the client VM running AKS engine.
demoK8scluster-rg	Contains the cluster artifacts deployed by AKS engine.
demoregistryinfra-rg	Contains the storage account and key vault supporting the self-hosted Docker Container Registry.
demoregistrycompute-rg	Contains the VM running Docker Swarm and the self-hosted container registry containers and the other supporting resources for networking and storage.
kubecodecamp-rg	Contains the VM and other resources used for building the Sentiment Analysis application that was instantiated on the K8s cluster.

Please also refer to the following graphic for a high-level overview of the architecture.

Prerequisites

All the step-by-step instructions and sample artifacts used to deploy AKS engine can be found in the Azure Stack Hub User Documentation and GitHub. We do not intend to repeat what has already been written – unless we want to specifically highlight a key concept. Instead, we will share lessons we learned while following along in the documentation.

We decided to test the supported scenario whereby AKS engine deploys all cluster artifacts into a new resource group rather than deploying the cluster to an existing VNET. We also chose to deploy a Linux-based cluster using the kubernetes-azurestack.json API model file found in the AKS engine GitHub repository. A Windows-based K8s cluster cannot be deployed by AKS engine to Azure Stack Hub at the time of this writing (November 2019). Do not attempt to use the kubernetes-windows.json file in the GitHub repository, as this will not be fully functional.

Addressing the prerequisites for AKS engine was very straightforward:

Ensured that the Azure Stack Hub system was completely healthy by running Test-AzureStack.
Verified sufficient available memory, storage, and public IP address capacity on the system.
Verified that the quotas embedded in the user subscription’s offers provided enough capacity for all the Kubernetes capabilities.
Used marketplace syndication to download the appropriate gallery items. We made sure to match the version of AKS engine to the correct version of the AKS Base Image. In our testing, we used AKS engine v0.43.0, which depends on version 2019.10.24 for the AKS Base Image.
Collected the name and ID of the Azure Stack Hub user subscription created in the lab.
Created the Azure AD service principal (SPN) through the Public Azure Portal. The documentation also links to a programmatic method for creating the SPN.
Assigned the SPN to the Contributor role of the user subscription.
Generated the SSH key and private key file using PuTTY Key Generator (PUTTYGEN) for each of the Linux VMs used during testing. We associated the PPK file extension with PUTTYGEN on the management workstation so the saved private key files are opened within PUTTYGEN for easy copy and pasting. We used these keys with PuTTY and WinSCP throughout the testing.
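The version-matching prerequisite above is easy to get wrong, so it can help to encode it as a small check. This Python sketch contains only the single pairing we used in our testing; the table and function names are our own invention, and any other pairings would need to come from the AKS engine release notes.

```python
# Only the pairing reported in this post is listed here.
REQUIRED_BASE_IMAGE = {"v0.43.0": "2019.10.24"}


def check_base_image(engine_version: str, base_image_version: str) -> bool:
    """Return True when the base image matches the AKS engine version."""
    expected = REQUIRED_BASE_IMAGE.get(engine_version)
    if expected is None:
        raise KeyError(f"no known base image pairing for AKS engine {engine_version}")
    return expected == base_image_version


if __name__ == "__main__":
    print(check_base_image("v0.43.0", "2019.10.24"))
```

An unknown AKS engine version raises an error rather than returning False, so a missing table entry is not mistaken for a mismatch.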

At this point, we built a Linux client VM for running the AKS engine command-line tool used to deploy and manage the K8s cluster. Here are the specifications of the VM provisioned:

Resource group: demoaksengine-rg
VM size: Standard_DS1_v2 (1 vCPU, 3.5 GB memory)
Managed disk: 30 GB default managed disk
Image: Latest Ubuntu 16.04-LTS gallery item from the Azure Stack Hub marketplace
Authentication type: SSH public key
Assigned a Public IP address so we could easily SSH to it from the management workstation

Since this VM was required for ongoing management and maintenance of the K8s cluster, we took the appropriate precautions to ensure that we could recover this VM in case of a disaster. A critical file that needed to be protected on this VM after the cluster was created is the apimodel.json file. This will be discussed later in this blog series.

Our Azure Stack Hub runs with certificates generated from an internal standalone CA in the lab. This means we needed to import our CA’s root certificate into the client VM’s OS certificate store so it could properly connect to the Azure Stack Hub management endpoints before going any further. We thought we would share the steps to import the certificate:

The standalone CA provides a file with a .cer extension when requesting the root certificate. In order to work with an Ubuntu certificate store, we had to convert this to a .crt file by issuing the following command from Git Bash on the management workstation:

openssl x509 -inform DER -in certnew.cer -out certnew.crt
Created a directory on the Ubuntu client VM for the extra CA certificate in /usr/share/ca-certificates:

sudo mkdir /usr/share/ca-certificates/extra
Transferred the certnew.crt file to the Ubuntu VM using WinSCP to the /home/azureuser directory. Then, we copied the file from the home directory to the /usr/share/ca-certificates/extra directory.

sudo cp certnew.crt /usr/share/ca-certificates/extra/certnew.crt
Appended a line to /etc/ca-certificates.conf.

sudo nano /etc/ca-certificates.conf

Add the following line to the very end of the file:

extra/certnew.crt
Updated the certificate store non-interactively:

sudo update-ca-certificates

This command produced the following output that verified that the lab’s CA certificate was successfully added:

Note: We learned that if this procedure to import a CA’s root certificate is ever carried out on a server already running Docker, you have to stop and re-start Docker at this point. This is done on Ubuntu via the following commands:

sudo systemctl stop docker
sudo systemctl start docker
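Two of the certificate import steps above are worth a closer look: the DER-to-PEM conversion and the edit to the conf file. The Python sketch below illustrates both using only the standard library. It is not what we ran in the lab (we used openssl and nano as shown), and the helper names are hypothetical. The second helper shows how the append could be made safe to repeat.

```python
import ssl
from pathlib import Path


def der_to_pem(der_bytes: bytes) -> str:
    """Base64-encode DER bytes and wrap them in BEGIN/END CERTIFICATE markers."""
    return ssl.DER_cert_to_PEM_cert(der_bytes)


def ensure_conf_entry(conf_path: Path, entry: str) -> bool:
    """Append `entry` to the conf file unless it is already listed.

    Returns True when the file was changed.
    """
    text = conf_path.read_text() if conf_path.exists() else ""
    if entry in text.splitlines():
        return False
    # Make sure the new entry starts on its own line.
    if text and not text.endswith("\n"):
        text += "\n"
    conf_path.write_text(text + entry + "\n")
    return True
```

Note that der_to_pem only re-encodes the bytes. It does not validate that they form a real certificate, so the openssl command remains the better choice for the actual conversion.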

We then SSH’d into the client VM. While in the home directory, we executed the prescribed command in the documentation to download the get-akse.sh AKS engine installation script.

curl -o get-akse.sh https://raw.githubusercontent.com/Azure/aks-engine/master/scripts/get-akse.sh
chmod 700 get-akse.sh
./get-akse.sh --version v0.43.0

Once installed, we issued the aks-engine version command to verify a successful installation of AKS engine.

Deploy K8s cluster

There was one more step that needed to be taken before issuing the command to deploy the K8s cluster. We needed to customize a cluster specification using an example API Model file. Since we were deploying a Linux-based Kubernetes cluster, we downloaded the kubernetes-azurestack.json file into the home directory of the client VM. Though we could have used nano on the Linux VM to customize the file, we decided to use WinSCP to copy this file over to the management workstation so we could use VS Code to modify it instead. Here are a few notes on this:

By issuing an aks-engine get-versions command, we noticed that cluster versions up to 1.17 were supported by AKS engine v0.43.0. However, the Azure Monitor for containers solution only supported versions 1.15 and earlier. We decided to leave the orchestratorRelease key value in the kubernetes-azurestack.json file at the default value of 1.14.
Inserted https://portal.rack04.azs.mhclabs.com as the portalURL.
Used labcluster as the dnsPrefix. This was the DNS name for the actual cluster. We confirmed that this hostname was unique in the lab environment.
Left the adminUsername at azureuser. Then, we copied and pasted the public key from PUTTYGEN into the keyData field. The following screen shot shows what was pasted.

Copied the modified file back to the home directory of the client VM.
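We made the edits above by hand in VS Code, but the same changes could be scripted. The following Python sketch shows one way to do that. Please treat the nesting of the keys as an assumption: the skeleton is a trimmed approximation of the example file, so compare it against the kubernetes-azurestack.json you actually download. The SSH key string is a placeholder.

```python
import copy
import json

# Per this post, Azure Monitor for containers supported 1.15 and earlier.
MAX_MONITORED_RELEASE = "1.15"


def release_tuple(release: str) -> tuple:
    """Turn '1.14' into (1, 14) so releases compare numerically."""
    return tuple(int(part) for part in release.split("."))


def customize_api_model(model: dict, portal_url: str, dns_prefix: str,
                        admin_username: str, ssh_public_key: str,
                        release: str = "1.14") -> dict:
    """Return a copy of the API model with the lab-specific values filled in."""
    if release_tuple(release) > release_tuple(MAX_MONITORED_RELEASE):
        raise ValueError(f"release {release} is newer than {MAX_MONITORED_RELEASE}")
    updated = copy.deepcopy(model)
    props = updated["properties"]
    props["orchestratorProfile"]["orchestratorRelease"] = release
    props["customCloudProfile"]["portalURL"] = portal_url
    props["masterProfile"]["dnsPrefix"] = dns_prefix
    props["linuxProfile"]["adminUsername"] = admin_username
    props["linuxProfile"]["ssh"]["publicKeys"][0]["keyData"] = ssh_public_key
    return updated


# Trimmed, assumed skeleton of the example file; the real file has more keys.
SKELETON = {
    "properties": {
        "orchestratorProfile": {"orchestratorRelease": "1.14"},
        "customCloudProfile": {"portalURL": ""},
        "masterProfile": {"dnsPrefix": ""},
        "linuxProfile": {
            "adminUsername": "azureuser",
            "ssh": {"publicKeys": [{"keyData": ""}]},
        },
    }
}

if __name__ == "__main__":
    model = customize_api_model(
        SKELETON, "https://portal.rack04.azs.mhclabs.com", "labcluster",
        "azureuser", "ssh-rsa AAAA-placeholder-key")
    print(json.dumps(model, indent=2))
```

The release guard reflects our decision to stay at or below the newest version the monitoring solution supported at the time.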

While still in the home directory of the client VM, we issued the command to deploy the K8s cluster. Here is the command that was executed. Remember that the client ID and client secret are associated with the SPN, and the subscription ID is that of the Azure Stack Hub user subscription.

aks-engine deploy \
--azure-env AzureStackCloud \
--location Rack04 \
--resource-group demoK8scluster-rg \
--api-model ./kubernetes-azurestack.json \
--output-directory demoK8scluster-rg \
--client-id xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
--client-secret xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
--subscription-id xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
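If you would rather not type the SPN secret at the prompt, the same command can be assembled from environment variables. This is only a sketch: the flags are the ones shown in the command above, while the AKSE_* variable names and both helper functions are hypothetical.

```python
import shlex
from typing import List, Mapping

REQUIRED_VARS = ("AKSE_CLIENT_ID", "AKSE_CLIENT_SECRET", "AKSE_SUBSCRIPTION_ID")


def build_deploy_command(env: Mapping[str, str], location: str,
                         resource_group: str, api_model: str) -> List[str]:
    """Assemble the aks-engine deploy argument list from the environment."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return [
        "aks-engine", "deploy",
        "--azure-env", "AzureStackCloud",
        "--location", location,
        "--resource-group", resource_group,
        "--api-model", api_model,
        "--output-directory", resource_group,
        "--client-id", env["AKSE_CLIENT_ID"],
        "--client-secret", env["AKSE_CLIENT_SECRET"],
        "--subscription-id", env["AKSE_SUBSCRIPTION_ID"],
    ]


def redact(command: List[str]) -> str:
    """Render the command for logging with the secret value masked."""
    shown = list(command)
    shown[shown.index("--client-secret") + 1] = "***"
    return shlex.join(shown)
```

In practice you would pass os.environ as the first argument and hand the resulting list to subprocess.run, logging only the redacted form.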

Once the deployment was complete and verified by deploying mysql with Helm, we copied the apimodel.json file to a safe location outside of the Azure Stack Hub scale unit. The file is found in the home directory of the client VM under a subdirectory with the name of the cluster’s resource group, in this case demoK8scluster-rg. This file was used as input in all of the other AKS engine operations. The generated apimodel.json contains the service principal, secret, and SSH public key used in the input API Model. It also has all the other metadata needed by AKS engine to perform all other operations. If this file gets lost, AKS engine won't be able to configure the cluster down the road.
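Because so much depends on apimodel.json, it is worth verifying the copy rather than trusting it. Here is a minimal Python sketch of a checksum-verified backup. The function names and the backup file naming scheme are our own assumptions, and the backup location should be secured because the file contains the SPN secret.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def back_up_api_model(output_dir: Path, backup_dir: Path) -> Path:
    """Copy apimodel.json out of the output directory and verify the copy."""
    source = output_dir / "apimodel.json"
    if not source.is_file():
        raise FileNotFoundError(source)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Prefix the copy with the output directory name to tell clusters apart.
    target = backup_dir / f"{output_dir.name}-apimodel.json"
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise OSError("backup does not match the source file")
    return target
```

For example, calling back_up_api_model with the demoK8scluster-rg directory would produce a file named demoK8scluster-rg-apimodel.json in the backup location.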

Onboarding the new K8s cluster to Azure Monitor for containers

Before introducing our first microservices-based application to the K8s cluster, we preferred to onboard the cluster to Azure Monitor for containers. Azure Monitor for containers not only provides a rich monitoring experience for AKS in Public Azure but also for K8s clusters deployed in Azure Stack using AKS engine. We wanted to see what resources were being used only by the Kubernetes system functions before deploying any applications. We performed the steps in this section on one of the K8s primary nodes using an SSH connection.

The prerequisites were fairly straightforward, but we did make a couple of observations while stepping through them:

We decided to perform the onboarding using the Helm chart. For this option, the latest version of the Helm client was required. We found that as long as we performed the steps under Verify your cluster in the documentation, we did not encounter any issues.
Since we were only running a single cluster in our environment, we found we didn’t have to do anything with regard to configuring kubectl or the Helm client to use the K8s cluster context.
We did not have to make any changes in the environment to allow the various ports to communicate properly within the cluster or externally with Public Azure.

Note: Microsoft also supports enabling monitoring on this K8s cluster on Azure Stack Hub deployed with AKS engine using the API definition as an alternative to using the Helm chart. The API definition option doesn’t have a dependency on Tiller or any other component. Using this option, monitoring can be enabled during the cluster creation itself. The only manual step for this option would be to add the Container Insights solution to the Log Analytics workspace specified in the API definition.

We already had a Log Analytics Workspace at our disposal for this testing. We did make one mistake during the onboarding preparations, though. We intended to manually add the Container Insights solution to the workspace but installed the legacy Container Monitoring solution instead of Container Insights. To be on the safe side, we ran the onboarding_azuremonitor_for_containers.ps1 PowerShell script and supplied the values for our resources as parameters. The script skipped the creation of the workspace since we already had one and just installed the Container Insights solution via an ARM template in GitHub identified in the script.

At this point, we could simply issue the Helm commands under the Install the chart section of the documentation. Besides inserting the workspace ID and workspace key, we also replaced the --name parameter value with azuremonitor-containers. We did not observe anything out of the ordinary during the onboarding process. Once complete, we had to wait about 10 minutes before we could go to the Public Azure portal and see our cluster appear. We had to remember to click on the drop-down menu for Environment in the Container Insights section in Azure Monitor and select “Azure Stack (Preview)” for the cluster to appear.

We hope this blog post proves to be insightful to anyone deploying a K8s cluster on Azure Stack Hub using AKS engine. We also trust that the information provided will assist in the onboarding of that new cluster to Azure Monitor for containers. In Part 3, we will step through our experience deploying a self-hosted Docker Container Registry into Azure Stack Hub.

Azure Stack HCI HCI Azure Arc

Exclusive Preview of Dell Azure Stack HCI Arc Integrated Configuration Compliance

Michael Lamia

Tue, 01 Mar 2022 20:39:03 -0000

Who doesn’t enjoy VIP treatment? Exciting opportunities to feel like royalty include winning box seats at a sporting event or getting invited to attend opening night at a new restaurant. I received an unexpected upgrade to business class on a flight a couple years ago and remember texting every celebratory meme I could find to friends and family! These are the moments in life to really savor.

In my line of work as a technical marketing engineer, I relish any situation where VIP stands for Very Important Person rather than Virtual IP address. Private previews of the latest technology often provide both flavors of VIP.

I consider myself fortunate to be among the first to experience cutting-edge solutions with the potential to solve today’s most vexing business challenges. I also get direct access to the best minds in the software and hardware industry. They welcome my feedback, and there’s no better feeling than knowing that I’ve made a meaningful contribution to a product that will benefit the broader community! Now it’s your turn to feel the thrill of gaining early access to long-awaited new software capabilities for Azure Stack HCI.

Your official preview invitation has arrived.

You are cordially invited to participate in an exclusive VIP preview of Azure Stack HCI Configuration and Policy Compliance Visibility from Dell Technologies, integrated with Azure Arc.

The Azure Arc portfolio demonstrates the unique Microsoft approach to delivering hybrid cloud by extending Azure platform services and management capabilities to data center, edge, and multi-cloud environments. Dell Technologies uses the Azure Policy guest configuration feature and Azure Arc-enabled servers to audit software and hardware settings in Dell Integrated System for Microsoft Azure Stack HCI.

Our engineering-validated integrated system is Azure hybrid by design and delivers efficient operations using our Dell OpenManage Integration with Microsoft Windows Admin Center extension and snap-ins.

When we first developed our extension, we delivered deep hardware monitoring, inventory, and troubleshooting capabilities. Over the last few years, we have collected valuable feedback from preview programs to drive further investment and innovation into our extension. Customer experience has helped us shape new features including:

One-click full stack lifecycle management using Cluster-Aware Updating
Automated cluster creation and expansion
Dynamic CPU core management
Intrinsic infrastructure security management

The Azure Arc integration from Dell Technologies complements Windows Admin Center and our OpenManage extension by applying robust governance services to the integrated system. Our Azure Arc integration creates software and hardware compliance policies for near real-time detection of infrastructure configuration drift at-scale. It protects clusters in the data center or geographically dispersed to ROBO and edge locations from malicious threats and inadvertent changes to operating system, BIOS, iDRAC, and network adapter settings on AX nodes from Dell Technologies. Without this visibility, you leave yourself vulnerable to security breaches, costly downtime, and degraded application performance.

All we need now is your experience and valuable feedback to help us fine-tune this critical capability!

Consider Azure Portal your observation deck.

Intentionally selected AX node attributes and values targeted by our Azure Arc integration are routinely checked for compliance against pre-defined business rules. Then, compliance results are visualized in the Policy blade of the Azure portal as shown in the following screen shots.

Help prevent costly business setbacks.

This guided preview is checking select OS-level, cluster-level, BIOS, iDRAC, and network adapter attributes that optimize Azure Stack HCI. If an unapproved change to these attribute values goes undetected, the integrated system may experience degradation to performance, availability, and security. The abnormal behavior of the system may not be readily traced back to the modified OS and hardware setting – delaying Mean Time to Repair (MTTR). The longer the incident takes to resolve, the greater the consequences to your business in the form of decreased productivity, lost revenue, or tarnished reputation.

Ready for your sneak peek?

Here are just some of the preview benefits in store:

Playing with the newest toys in your own sandbox and directly with the engineers creating the solution
Helping to make a cutting-edge technology even better with a vendor who is listening and responding to your feedback
Achieving superhero status at your business by automating routine administrative tasks that strengthen infrastructure integrity and improve operational efficiency

Availability is limited for this guided preview. To claim your spot, please contact your account manager right away. They will coordinate with the internal teams at Dell Technologies and schedule further conversations with you. A professional services engagement is required to install the Azure Arc integration during the preview phase. We will work together to prepare the Azure artifacts and run the required scripts. Over time, Dell Technologies intends to expand this compliance visibility to a much larger set of attributes in an extensible, user-friendly framework.

I hope you’re as excited as I am to deliver this configuration and policy compliance visibility using Azure Arc to Dell Integrated System for Microsoft Azure Stack HCI. The technical previews that I’ve been a part of have been some of the most memorable and rewarding experiences of my career. An unexpected upgrade to business class is nice but contributing to the success of a technology that will help my industry peers for years to come? Priceless.

Author: Michael Lamia

Twitter: @Evolving_Techie

LinkedIn: https://www.linkedin.com/in/michaellamia/

Azure Stack HCI HCI hybrid cloud Windows Admin Center life cycle management

Experts Recommend Automation for a Healthier Lifestyle

Michael Lamia

Wed, 20 Oct 2021 19:59:25 -0000

Like any good techie, I can get a little obsessed with gadgets that improve my quality of life. Take, for example, my recent discovery of wearable technology that eases the symptoms of motion sickness. For most of my life, I’ve had to take over-the-counter or prescription medicine when boating, flying, and going on road trips. Then, I stumbled across a device that I could wear around my wrist that promised to solve the problem without the side effects. Hesitantly, I bought the device and asked a friend to drive like a maniac around town while I sat in the back seat. It actually worked – no headache, no nausea, and no grogginess from meds! Needless to say, I never leave home without my trusty gizmo to keep motion sickness at bay.

Throughout my career in managing IT infrastructure, stress has affected my quality of life almost as much as motion sickness. There is one responsibility that has always caused more angst than anything else: lifecycle management (LCM). To narrow that down a bit, I’m specifically talking about patching and updating IT systems under my control. I have sometimes been derelict in my duties because of annoying manual steps that distract me from working on the fun, highly visible projects. It’s these manual steps that can cause the dreaded DU/DL (data unavailable or data loss) to rear its ugly head. Can you say insomnia?

Innovative technology to the rescue once again! While creating a demo video last year for our Dell EMC OpenManage Integration with Microsoft Windows Admin Center (OMIMSWAC), I was blown away by how easy we made the BIOS, firmware, and driver updates on clusters. The video did a pretty good job of showing the power of the Cluster-Aware Updating (CAU) feature, but it didn’t go far enough. I needed to quantify its full potential to change an IT professional’s life by pitting OMIMSWAC’s automated CAU approach against a manual, node-based approach. I captured the results of the bake-off in Dell EMC HCI Solutions for Microsoft Windows Server: Lifecycle Management Approach Comparison.

Automation Triumphs!

For this white paper to really stand the test of time, I knew I needed an apples-to-apples comparison. First, I referred to HCI Operations Guide—Managing and Monitoring the Solution Infrastructure Life Cycle, which detailed the hardware updating procedures for both the CAU and node-based approaches. Then, I built a 4-node Dell EMC HCI Solutions for Windows Server 2019 cluster, performed both update scenarios, and recorded the task durations. We all know that automation is king, but I didn’t expect the final tally to be quite this good:

The automated approach reduced the number of steps in the process by 82%.
The automated approach required 90% less of my focused attention. In other words, I was able to attend to other duties while the updates were installing.
If I had been in a production environment, the maintenance window approved by the change control board would have been cut in half.
The automated process left almost no opportunity for human error.

As you can see from the following charts taken from the paper, these numbers only improved as I extrapolated them out to the maximum Windows Server HCI cluster size of 16 nodes.

I thought these results were too good to be true, so I checked my steps about 10 times. In fact, I even debated with my Marketing and Product Management counterparts about sharing these claims with the public! I could hear our customers saying, “Oh, yeah, right! These are just marketecture hero numbers.” But in this case, I collected the hard data myself. I am still confident that these results will stand up to any scrutiny. This is reality – not dreamland!

Just when I thought it couldn’t get any better

So why am I blogging about a project I did last year? Just when I thought the testing results in the white paper couldn’t possibly get any better, Dell EMC Integrated System for Microsoft Azure Stack HCI came along. Azure Stack HCI is Microsoft’s purpose-built operating system delivered as an Azure service. The current release when writing this blog was Azure Stack HCI, version 20H2. Our Solution Brief provides a great overview of our all-in-one validated HCI system, which delivers efficient operations, flexible consumption models, and end-to-end enterprise support and services. But what I’m most excited about are two lifecycle management enhancements – 1-click full stack LCM and Kernel Soft Reboot – that will put an end to the old adage, “If it looks too good to be true, it probably is.”

Let’s invite OS updates to the party

OMIMSWAC was at version 1.1 when I did my testing last year. In that version, the CAU feature focused on the hardware – BIOS, firmware, and drivers. In OMIMSWAC v2.0, we developed an exclusive snap-in to Microsoft’s Failover Cluster Tool Extension to create 1-click full stack LCM. Only available for clusters running Azure Stack HCI, a simple workflow in Windows Admin Center automates not only the hardware updates – but also the operating system updates. How do I see this feature lowering my blood pressure?

Applying the OS and hardware updates can typically require multiple server reboots. With 1-click full stack LCM, reboots are delayed until all updates are installed. A single reboot per node in the cluster results in greater time savings and shorter maintenance windows.
I won’t have to use multiple tools to patch different aspects of my infrastructure. The more I can consolidate the number of management tools in my environment, the better.
A simple, guided workflow that tightly integrates the Microsoft extension and OMIMSWAC snap-in ensures that I won’t miss any steps and provides one view to monitor update progress.
The OMIMSWAC snap-in provides necessary node validation at the beginning of the hardware updates phase of the workflow. These checks verify that my cluster is running validated AX nodes from Dell Technologies and that all the nodes are homogeneous. This gives me peace of mind knowing that my updates will be applied successfully. I can also rest assured that there will be no interruption to the workloads running in my VMs and containers since this feature leverages CAU.
The hardware updates leverage the Microsoft HCI solution catalog from Dell Technologies. Each BIOS, firmware, and driver in this catalog is validated by our engineering team to optimize the Azure Stack HCI experience.

The following screen shots were taken from the full stack CAU workflow. The first step indicates which OS updates are available for the cluster nodes.

Node validation is performed first before moving forward with hardware updates.

If the Windows Admin Center host is connected to the Internet, the online update source approach obtains all the systems management utilities and the engineering validated solution catalog automatically. If operating in an edge or disconnected environment, the solution catalog can be created with Dell EMC Repository Manager and placed on a file server share accessible from the cluster nodes.

The following image shows a generated compliance report. All non-compliant components are selected by default for updating. After this point, all the OS and non-compliant hardware components will be updated together with only a single reboot per node in the cluster and with no impact to running workloads.

Life is too short to wait for server reboots

Speaking of reboots, Kernel Soft Reboot (KSR) is a new feature coming in Azure Stack HCI, version 21H2 that also has the potential to make my white paper claims even more jaw dropping. KSR will give me the ability to perform a “software-only restart” on my servers – sparing me from watching the paint dry during those long physical server reboots. Initially, the types of updates in scope will be OS quality and security hotfixes since these don’t require BIOS/firmware initialization. Dell Technologies is also working on leveraging KSR for the infrastructure updates in a future release of OMIMSWAC.

KSR will be especially beneficial when using Microsoft’s CAU extension in Windows Admin Center. The overall time savings from KSR multiply for clusters because faster restarts mean less resyncing of data after CAU resumes each cluster node. Each node should reboot at Mach speed if there are only Azure Stack HCI OS hotfixes and Dell EMC Integrated System infrastructure updates that do not require a full reboot. I will definitely be hounding my Product Managers and Engineering team to deliver KSR for infrastructure updates in our OMIMSWAC extension ASAP.

Bake off rematch

I decided to hold off on doing a new bakeoff until Azure Stack HCI, version 21H2 is released with KSR. I also want to wait until we bring the benefits of KSR to OMIMSWAC for infrastructure updates. The combination of OMIMSWAC 1-click full stack CAU and KSR will continue to make OMIMSWAC unbeatable for seamless lifecycle management. This means better outcomes for our organizations, improved blood pressure and quality of life for IT pros, and more motion-sickness-free adventure vacations. I’m also looking forward to spending more time learning exciting new technologies and less time with routine administrative tasks.

If you’d like to get hands-on with all the different features in OMIMSWAC, check out the Interactive Demo in Dell Technologies Demo Center. Also, check out my other white papers, blogs, and videos in the Dell Technologies Info Hub.

containers Microsoft Kubernetes Docker Azure Stack Hub

Deploy a self-hosted Docker Container Registry on Azure Stack Hub

Michael Lamia

Thu, 22 Jul 2021 10:35:43 -0000

Introduction

Welcome to Part 3 of a three-part blog series on running containerized applications on Microsoft Azure’s hybrid ecosystem. In this part, we step through how we deployed a self-hosted, open-source Docker Registry v2 to an Azure Stack Hub user subscription. We also discuss how we pushed a microservices-based application to the self-hosted registry and then pulled those images from that registry onto the K8s cluster we deployed using AKS engine.

Here are the links to all the series articles:

Part 1: Running containerized applications on Microsoft Azure’s hybrid ecosystem – Provides an overview of the concepts covered in the blog series.

Part 2: Deploy K8s Cluster into an Azure Stack Hub user subscription – Set up an AKS engine client VM, deploy a cluster using AKS engine, and onboard the cluster to Azure Monitor for containers.

Part 3: Deploy a self-hosted Docker Container Registry – Use one of the Azure Stack Hub QuickStart templates to set up a container registry and push images to this registry. Then, pull these images from the registry into the K8s cluster deployed with AKS engine in Part 2.

Here is a diagram depicting the high-level architecture in the lab for review.

There are a few reasons why an organization may want to deploy and maintain a container registry on-premises rather than leveraging a publicly accessible registry like Docker Hub. This approach may be particularly appealing to customers in air-gapped deployments where there is unreliable connectivity to Public Azure or no connectivity at all. This can work well with K8s clusters that have been deployed using AKS engine at military installations or other such locales. There may also be regulatory compliance, data sovereignty, or data gravity issues that can be addressed with a local container registry. Some customers may simply require tighter control over where their images are being stored or need to integrate image storage and distribution tightly into their in-house development workflows.

Prerequisites

The deployment of Docker Registry v2 into an Azure Stack Hub user subscription was performed using the 101-vm-linux-docker-registry Azure Stack Hub QuickStart template. There are two phases to this deployment:

1. Creation of a storage account and key vault in a discrete resource group by running the setup.ps1 PowerShell script provided in the template repo.

2. The setup.ps1 script also created an azuredeploy.parameters.json file that is used in the PowerShell command to deploy the ARM template that describes the container registry VM and its associated resources. These resources get provisioned into a separate resource group.

Be aware that there is another experimental repository in GitHub called msazurestackworkloads for deploying Docker Registry. The main difference is that the code in the msazurestackworkloads repo includes artifacts for creating a marketplace gallery item that an Azure Stack Hub user could deploy into their subscription. Again, this is currently experimental and is not supported for deploying the Docker Registry.

One of the prerequisite steps was to obtain an X.509 SSL certificate in PFX format for loading into key vault. We pointed to its location during the running of the setup.ps1 script. We used our lab’s internal standalone CA to create the certificate, which is the same CA used for deploying the K8s cluster with AKS engine. We thought we’d share the steps we took to obtain this certificate in case any readers aren’t familiar with the process. All these steps must be completed from the same workstation to ensure access to the private key.

We performed these steps on the management workstation:

1. Created an INF file that looked like the following. The subject’s CN is the DNS name we provided for the registry VM’s public IP address.

[Version]
Signature="$Windows NT$"

[NewRequest]
Subject = "CN=cseregistry.rack04.cloudapp.azs.mhclabs.com,O=CSE Lab,L=Round Rock,S=Texas,C=US"
Exportable = TRUE
KeyLength = 2048
KeySpec = 1
KeyUsage = 0xA0
MachineKeySet = True
ProviderName = "Microsoft RSA SChannel Cryptographic Provider"
HashAlgorithm = SHA256
RequestType = PKCS10

[Strings]
szOID_SUBJECT_ALT_NAME2 = "2.5.29.17"
szOID_ENHANCED_KEY_USAGE = "2.5.29.37"
szOID_PKIX_KP_SERVER_AUTH = "1.3.6.1.5.5.7.3.1"
szOID_PKIX_KP_CLIENT_AUTH = "1.3.6.1.5.5.7.3.2"

[Extensions]
%szOID_SUBJECT_ALT_NAME2% = "{text}dns=cseregistry.rack04.cloudapp.azs.mhclabs.com"
%szOID_ENHANCED_KEY_USAGE% = "{text}%szOID_PKIX_KP_SERVER_AUTH%,%szOID_PKIX_KP_CLIENT_AUTH%"

[RequestAttributes]

2. We used the certreq.exe command to generate a CSR that we then submitted to the CA.

certreq.exe -new cseregistry_req.inf cseregistry_csr.req

3. We received a .cer file back from the standalone CA and followed the instructions here to convert this .cer file to a .pfx file for use with the container registry.

Prepare Azure Stack PKI certificates for deployment or rotation

We also needed to have the CA’s root certificate in .crt file format. We originally obtained this during the K8s cluster deployment using AKS engine. This needs to be imported into the certificate store of any device that intends to interact with the container registry.
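For readers who need to request certificates for more than one registry host, the INF file from step 1 can be generated instead of hand-edited. The following is a minimal Python sketch under our own assumptions: the function name build_request_inf and its parameters are hypothetical, and the field values simply mirror the INF shown above. The point it illustrates is that the same DNS name must appear in both the subject CN and the subject alternative name.

```python
def build_request_inf(fqdn, org, locality, state, country):
    """Render a certreq.exe request file modeled on the INF shown in step 1.

    Hypothetical helper: the field values are copied from the INF in this
    post; only the subject and the SAN entry vary with the arguments.
    """
    subject = f"CN={fqdn},O={org},L={locality},S={state},C={country}"
    lines = [
        "[Version]",
        'Signature="$Windows NT$"',
        "[NewRequest]",
        f'Subject = "{subject}"',
        "Exportable = TRUE",
        "KeyLength = 2048",
        "KeySpec = 1",
        "KeyUsage = 0xA0",
        "MachineKeySet = True",
        'ProviderName = "Microsoft RSA SChannel Cryptographic Provider"',
        "HashAlgorithm = SHA256",
        "RequestType = PKCS10",
        "[Strings]",
        'szOID_SUBJECT_ALT_NAME2 = "2.5.29.17"',
        'szOID_ENHANCED_KEY_USAGE = "2.5.29.37"',
        'szOID_PKIX_KP_SERVER_AUTH = "1.3.6.1.5.5.7.3.1"',
        'szOID_PKIX_KP_CLIENT_AUTH = "1.3.6.1.5.5.7.3.2"',
        "[Extensions]",
        # The SAN must carry the same DNS name as the subject CN.
        '%szOID_SUBJECT_ALT_NAME2% = "{text}dns=' + fqdn + '"',
        '%szOID_ENHANCED_KEY_USAGE% = '
        '"{text}%szOID_PKIX_KP_SERVER_AUTH%,%szOID_PKIX_KP_CLIENT_AUTH%"',
        "[RequestAttributes]",
    ]
    return "\n".join(lines) + "\n"


inf_text = build_request_inf(
    "cseregistry.rack04.cloudapp.azs.mhclabs.com",
    "CSE Lab", "Round Rock", "Texas", "US",
)
print(inf_text)
```

The resulting text can be saved as cseregistry_req.inf and passed to the certreq.exe command shown in step 2.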

Deploy the container registry supporting infrastructure

We used the setup.ps1 PowerShell script included in the QuickStart template’s GitHub repo for creating the supporting infrastructure for the container registry. We named the new resource group created by this script demoregistryinfra-rg. This resource group contains a storage account and key vault. The registry is configured to use the Azure storage driver to persist the container images in the storage account blob container. The key vault stores the credentials required to authenticate to the registry as a secret and secures the certificate. A service principal (SPN) created prior to executing the setup.ps1 script is leveraged to access the storage account and key vault.

Here are the values we used for the variables in our setup.ps1 script for reference (sensitive information removed, of course). Notice that the $dnsLabelName value is only the hostname of the container registry server and not the fully qualified domain name.

$location = "rack04"
$resourceGroup = "registryinfra-rg"
$saName = "registrystorage"
$saContainer = "images"

$kvName = "registrykv"
$pfxSecret = "registrypfxsecret"
$pfxPath = "D:\file_system_path\cseregistry.pfx"
$pfxPass = "mypassword"
$spnName = "Application (client) ID"
$spnSecret = "Secret"
$userName = "cseadmin"
$userPass = "!!123abc"

$dnsLabelName = "cseregistry"
$sshKey = "SSH key generated by PuttyGen"
$vmSize = "Standard_F8s_v2"
$registryTag = "2.7.1"
$registryReplicas = "5"
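To make the relationship between $dnsLabelName and the registry's full DNS name concrete, here is a small Python sketch. It assumes the name is composed as label, then location, then "cloudapp", then the stamp's external domain, which is what the host name used throughout this post suggests. The function name registry_fqdn and the split of "azs.mhclabs.com" as the external domain are our own illustration, not part of the QuickStart template.

```python
def registry_fqdn(dns_label, location, external_domain):
    """Compose the registry's fully qualified name from its parts.

    Assumption: <label>.<location>.cloudapp.<external domain>, inferred
    from the host name used in this post.
    """
    if "." in dns_label:
        # $dnsLabelName must be the bare hostname, not an FQDN.
        raise ValueError("dns_label must be a hostname only, without dots")
    return f"{dns_label}.{location}.cloudapp.{external_domain}"


fqdn = registry_fqdn("cseregistry", "rack04", "azs.mhclabs.com")
print(fqdn)
```

This is the same name that appears as the CN and SAN in the certificate request, so a mismatch here would surface later as a TLS error when pushing or pulling images.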

Deploy the container registry using the QuickStart template

We deployed the QuickStart template using a PowerShell command similar to the one indicated in the README.md of the GitHub repo. Again, the azuredeploy.parameters.json file was created automatically by the setup.ps1 script. This was very straightforward. The only thing to mention is that we created a new resource group when deploying the QuickStart template. We could have also selected an existing resource group that did not contain any resources.

Testing the Docker Registry with the Sentiment Analyzer application

At this point, it was time to test the K8s cluster and self-hosted container registry running on Azure Stack Hub from end-to-end. For this, we followed a brilliant blog article entitled Learn Kubernetes in Under 3 Hours: A Detailed Guide to Orchestrating Containers written by Rinor Maloku. This was a perfect introduction to the world of creating a microservices-based application running in multiple containers. It covers Docker and Kubernetes fundamentals and is an excellent primer for anyone just getting started in the world of containers and container orchestration. The name of the application is Sentiment Analyser, and it uses text analysis to ascertain the emotion of a sentence.

Learn Kubernetes in Under 3 Hours: A Detailed Guide to Orchestrating Containers

We won’t share all the notes we took while walking through the article. However, there are a couple of tips we wanted to highlight as they pertain to testing the K8s cluster and new self-hosted Docker Registry in the lab:

The first thing we did to prepare for creating the Sentiment Analyser application was to set up a development Ubuntu 18.04 VM in a unique resource group in our lab’s user subscription on Azure Stack Hub. We installed Firefox on the VM and Xming on the management workstation so we could test the functionality of the application at the required points in the process. Then, it was just a matter of setting up PuTTY properly with X11 forwarding.
Before installing anything else on the VM, we imported the root certificate from our lab’s standalone CA. This was critical to facilitate the secure connection between this VM and the registry VM for when we started pushing the images.
Whenever Rinor’s article talked about pushing the container images to Docker Hub, we instead targeted the self-hosted registry running on Azure Stack Hub. We had to take these steps on the development VM to facilitate those procedures:

Logged into the container registry using the credentials for authentication that we specified in the setup.ps1 script.
Tagged the image to target the container registry.

sudo docker tag <image ID> cseregistry.rack04.cloudapp.azs.mhclabs.com/sentiment-analysis-frontend

Pushed the image to the container registry.

sudo docker push cseregistry.rack04.cloudapp.azs.mhclabs.com/sentiment-analysis-frontend

Once the images were pushed to the registry, we found out that we could view them from the management workstation by browsing to the following URL and authenticating to the registry:

https://cseregistry.rack04.cloudapp.azs.mhclabs.com/v2/_catalog

The following output was given after we pushed all the application images:

{"repositories":["sentiment-analysis-frontend","sentiment-analysis-logic","sentiment-analysis-web-app"]}
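The catalog check above can also be scripted. The following Python sketch builds the authenticated request for the /v2/_catalog endpoint and parses a response like the one shown. It assumes the registry accepts HTTP basic authentication with the credentials from setup.ps1. The helper names are hypothetical, and the request is only constructed here, not sent.

```python
import base64
import json
import urllib.request

REGISTRY = "cseregistry.rack04.cloudapp.azs.mhclabs.com"


def catalog_request(registry, username, password):
    """Build (but do not send) a GET request for the registry catalog."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    req = urllib.request.Request(f"https://{registry}/v2/_catalog")
    req.add_header("Authorization", f"Basic {token}")
    return req


def repositories(catalog_body):
    """Extract repository names from a /v2/_catalog JSON response."""
    return json.loads(catalog_body)["repositories"]


def image_refs(registry, repos):
    """Turn repository names into the references used for tag, push, and pull."""
    return [f"{registry}/{name}" for name in repos]


# Response body copied from the output shown in this post.
sample = ('{"repositories":["sentiment-analysis-frontend",'
          '"sentiment-analysis-logic","sentiment-analysis-web-app"]}')

req = catalog_request(REGISTRY, "cseadmin", "!!123abc")
for ref in image_refs(REGISTRY, repositories(sample)):
    print(ref)
```

Sending the request with urllib.request.urlopen(req) would only succeed from a machine that trusts the lab CA's root certificate, for the same reason the certificate had to be imported on the development VM.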

We noticed that we did not have to import the lab’s standalone CA’s root certificate on the primary node before attempting to pull the image from the container registry. We assumed that the cluster picked up the root certificate from the Azure Stack Hub system, as there was a file named /etc/ssl/certs/azsCertificate.pem on the primary node from where we were running kubectl.
Prior to attempting to create the K8s pod for the Sentiment Analyser frontend, we had to create a Kubernetes cluster secret. This is always necessary when pulling images from private repositories – whether they are private repos on Docker Hub or privately hosted on-premises using a self-hosted container registry. We followed the instructions here to create the secret:

Pull an Image from a Private Registry

In the section of this article entitled Create a Secret by providing credentials on the command line, we discovered a couple items to note:

When issuing the kubectl create secret docker-registry command, we had to enclose the password in single quotes because it was a complex password.

kubectl create secret docker-registry regcred --docker-server=cseregistry.rack04.cloudapp.azs.mhclabs.com --docker-username=cseadmin --docker-password='!!123abc' --docker-email=<email address>

We’ve now gotten into the habit of verifying the secret after we create it by using the following command:

kubectl get secret regcred --output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode
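It helps to know what that decoded output should look like. The Python sketch below mimics, under our assumptions, the structure that kubectl stores in a docker-registry secret: an "auths" map keyed by registry server, with an "auth" field holding base64 of username:password. The function names are hypothetical and the email address is a placeholder.

```python
import base64
import json


def docker_config_json(server, username, password, email):
    """Build the JSON document stored under .dockerconfigjson."""
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    return json.dumps({
        "auths": {
            server: {
                "username": username,
                "password": password,
                "email": email,
                "auth": auth,
            }
        }
    })


def secret_data(config_json):
    """Kubernetes keeps Secret values base64-encoded under .data."""
    return {".dockerconfigjson": base64.b64encode(config_json.encode()).decode()}


def decode_secret(data):
    """Equivalent of piping the jsonpath output through base64 --decode."""
    return json.loads(base64.b64decode(data[".dockerconfigjson"]))


server = "cseregistry.rack04.cloudapp.azs.mhclabs.com"
cfg = docker_config_json(server, "cseadmin", "!!123abc", "user@example.com")
decoded = decode_secret(secret_data(cfg))
print(json.dumps(decoded, indent=2))
```

If the decoded password differs from what was typed, the shell likely altered it before kubectl saw it, which is why the single quotes around a complex password matter.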

Then, when we came to the modification of the YAML files in the article, we learned just how strict YAML is about the formatting of the file. When we added the imagePullSecrets: object, we had to ensure it perfectly lined up with the containers: object. It is also worth noting that the numbered comments in the right-hand column did not need to be duplicated.

Here is the content of the file that worked. Note that imagePullSecrets: sits at the same indentation level as containers:, both nested under spec:.

apiVersion: v1
kind: Pod      # 1
metadata:
  name: sa-frontend
  labels:
    app: sa-frontend   # 2
spec: # 3
  containers:
    - image: cseregistry.rack04.cloudapp.azs.mhclabs.com/sentiment-analysis-frontend # 4
      name: sa-frontend      # 5
      ports:
        - containerPort: 80      # 6
  imagePullSecrets:
    - name: regcred
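One way to sidestep YAML indentation mistakes is to build the manifest as a data structure and emit JSON, which kubectl also accepts. The following Python sketch is only an illustration of the nesting. The function name frontend_pod is hypothetical, and the field values are taken from the Pod definition in this post.

```python
import json


def frontend_pod(image, pull_secret):
    """Return the sa-frontend Pod manifest as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": "sa-frontend",
            "labels": {"app": "sa-frontend"},
        },
        "spec": {
            # containers and imagePullSecrets are siblings under spec.
            "containers": [
                {
                    "image": image,
                    "name": "sa-frontend",
                    "ports": [{"containerPort": 80}],
                }
            ],
            "imagePullSecrets": [{"name": pull_secret}],
        },
    }


manifest = frontend_pod(
    "cseregistry.rack04.cloudapp.azs.mhclabs.com/sentiment-analysis-frontend",
    "regcred",
)
print(json.dumps(manifest, indent=2))
```

Because the nesting is explicit in the dictionary, there is no way to accidentally place imagePullSecrets inside the container entry or at the top level of the Pod.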

At this point, we were able to observe the fully functional Sentiment Analyser application running on our K8s cluster on Azure Stack Hub. We were not only running this application on-premises in a highly prescriptive, self-managed Kubernetes cluster, but we were also able to do so while leveraging a self-hosted Docker Registry for the transferring of the images. We could also proceed to Azure Monitor for containers using the Public Azure portal to monitor our running containerized application and create thresholds for timely alerting on any potential issues.

Articles in this blog series

We hope this blog post proves to be insightful to anyone deploying a self-hosted container registry on Azure Stack Hub. It has been a great learning experience stepping through the deployment of a containerized application using the Microsoft Azure toolset. There are so many other things we want to try like Deploying Azure Cognitive Services to Azure Stack Hub and using Azure Arc to run Azure data services on our K8s cluster on Azure Stack Hub. We look forward to sharing more of our testing results on these exciting capabilities in the near future.

Storage Spaces Direct Hyper-V AMD AX nodes ROBO

Value Optimized AX-6515 for ROBO Use Cases

Michael Lamia

Wed, 16 Jun 2021 13:35:49 -0000

Introduction

Small offices and remote branch office (ROBO) use cases present special challenges for IT organizations. The issues tend to revolve around how to implement a scalable, resilient, secure, and highly performant platform at an affordable TCO. The infrastructure must be capable enough to efficiently run a highly diversified portfolio of applications and services and yet be simple to deploy, update, and support by a local IT generalist. Dell Technologies and Microsoft help you accelerate business outcomes in these unique ROBO environments with our Dell EMC Solutions for Microsoft Azure Stack HCI.

In this blog post, we share VMFleet results observed in the Dell Technologies labs for our newest AX-6515 two-node configuration – ideal for ROBO environments. Optimized for value, the small but powerful AX-6515 node packs a dense, single-socket 2nd Gen AMD EPYC processor in a 1RU chassis delivering peak performance and excellent TCO. We also included the Dell EMC PowerSwitch S5212F-ON in our testing to provide 25GbE network connectivity for the storage, management, and VM traffic in a small form factor. The Dell EMC Solutions for Azure Stack HCI Deployment Guide was followed to construct the test lab and applies only to infrastructure that is built with validated and certified AX nodes running Microsoft Windows Server 2019 Datacenter from Dell Technologies.

We were quite impressed with the VMFleet results. First, we stressed the cluster’s storage subsystem to its limits using scenarios aimed at identifying maximum IOPS, latency, and throughput. Then, we adjusted the test parameters to be more representative of real-world workloads. The following summary of findings indicated to us that this two-node, AMD-based, all-flash cluster could meet or exceed the performance requirements of workload profiles often found in ROBO environments:

Achieved over 1 million IOPS at microsecond latency using a 4k block size and 100% random-read IO pattern.
Achieved over 400,000 IOPS at 4 millisecond latency using a 4k block size and 100% random-write IO pattern.
Using 512k block sizes, drove 6 GB/s and 12 GB/s throughput for 100% sequential-write and 100% sequential-read IO patterns, respectively.
Using a range of real-world scenarios, the cluster achieved hundreds of thousands of IOPS at under 7 milliseconds latency and drove between 5 – 12 GB/s of sustained throughput.
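The IOPS and throughput figures above are two views of the same quantity, linked by block size: throughput equals IOPS multiplied by bytes per IO. The Python sketch below works through that arithmetic for the headline numbers. It assumes "4k" and "512k" mean 4 KiB and 512 KiB and that GB means 10^9 bytes. Because the findings are stated as "over" a threshold, the results are lower bounds rather than measured values.

```python
KIB = 1024


def throughput_gb_per_s(iops, block_bytes):
    """Throughput implied by an IOPS figure at a given block size."""
    return iops * block_bytes / 1e9


def iops_for_throughput(gb_per_s, block_bytes):
    """IOPS implied by a throughput figure at a given block size."""
    return gb_per_s * 1e9 / block_bytes


# Small-block tests: high IOPS, modest throughput.
read_4k = throughput_gb_per_s(1_000_000, 4 * KIB)   # about 4.1 GB/s
write_4k = throughput_gb_per_s(400_000, 4 * KIB)    # about 1.6 GB/s

# Large-block tests: high throughput, comparatively few IOPS.
seq_read_512k = iops_for_throughput(12, 512 * KIB)   # about 22,900 IOPS
seq_write_512k = iops_for_throughput(6, 512 * KIB)   # about 11,400 IOPS

print(f"{read_4k:.2f} GB/s, {write_4k:.2f} GB/s")
print(f"{seq_read_512k:.0f} IOPS, {seq_write_512k:.0f} IOPS")
```

This is why the small-block scenarios are reported in IOPS and the 512k scenarios in GB/s: each metric is the one that the corresponding block size stresses.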

Lab Setup

The following diagram illustrates the environment created in the Dell Technologies labs for the VMFleet testing. Ancillary services required for cluster operations such as DNS, Active Directory, and a file server for cluster quorum are not depicted.

Figure 1 Network topology

Table 1 Cluster configuration

Cluster Design Elements	Description
Number of cluster nodes	2
Cluster node model	AX-6515 nodes
Number of network switches for RDMA and TCP/IP traffic	2
Network switch model	Dell EMC PowerSwitch S5212F-ON
Network topology	Fully-converged network configuration. RDMA and TCP/IP traffic traversing 2 x 25GbE network connections from each host.
Network switch for OOB management	Dell EMC PowerSwitch S3048-ON
Resiliency option	Two-way mirror
Usable storage capacity	Approximately 12 TB

Table 2 Cluster node resources

Resources per Cluster Node	Description
CPU	Single-socket AMD EPYC 7702P 64-Core Processor
Memory	256 GB DDR4 RAM
Storage controller for OS	BOSS-S1 adapter card
Physical drives for OS	2 x Intel 240 GB M.2 SATA drives configured as RAID 1
Storage controller for Storage Spaces Direct (S2D)	HBA330 Mini
Physical drives	8 x 1.92 TB Mixed Use KIOXIA SAS SSDs
Network adapter	Mellanox ConnectX-5 Dual Port 10/25GbE SFP28 Adapter
Operating System	Windows Server 2019 Datacenter

The architectures of Azure Stack HCI solutions are highly opinionated and prescriptive. Each design is extensively tested and validated by Dell Technologies Engineering. Here is a summary of the key quality attributes that define these architectures followed by a section devoted to our performance findings.

Efficiency – Many customers are interested in improving performance and gaining efficiencies by modernizing their aging virtualization platforms with HCI. Using Azure Stack HCI helps avoid a DIY approach to IT infrastructure, which is prone to human error and is more labor intensive.
Maintainability – Our solution makes it simple to incorporate hybrid capabilities to reduce operational burden using Microsoft Windows Admin Center (WAC). Services in Azure can also be leveraged to avoid additional on-premises investments for management, monitoring, BCDR, security, and more. We have also developed the Dell EMC OpenManage Integration with Microsoft Windows Admin Center to assist with hardware monitoring and to simplify updates with Cluster-Aware Updating (CAU).
Availability – Using a two-way mirror, we always have two copies of our data. This configuration can survive a single drive failure in one node or survive an entire node failure. However, the cluster cannot survive two failures simultaneously on both nodes. In case greater resiliency is desired, volumes can be created using nested resiliency. Nested resiliency is discussed in more detail in the "Optional modifications to the architecture" section later in this blog post.
Supportability – Support is provided by dedicated Dell Technologies ProSupport Plus and ProSupport for Software technicians who have expertise specifically tailored to Azure Stack HCI solutions.

Testing Results

We leveraged VMFleet to benchmark the storage subsystem of our 2-node cluster. Many Microsoft customers and partners rely on this tool to help them stress test their Azure Stack HCI clusters. VMFleet consists of a set of PowerShell scripts that deploy virtual machines to a Hyper-V cluster and execute Microsoft’s DiskSpd within those VMs to generate IO. The following table presents the range of VMFleet and DiskSpd parameters used during our testing in the Dell Technologies labs.

Table 3 Test parameters

VMFleet and DiskSpd Parameters	Values
Number of VMs running per node	20
vCPUs per VM	2
Memory per VM	8 GB
VHDX size per VM	40 GB
VM Operating System	Windows Server 2019
Cluster Shared Volume (CSV) in-memory read cache size	0
Block sizes (B)	4k – 512k
Thread count (T)	2
Outstanding IOs (O)	32
Write percentages (W)	0, 20, 50, 100
IO patterns (P)	Random, Sequential
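
To put the Table 3 parameters in context, the following minimal sketch totals the resources the VMFleet virtual machines consume on the two-node lab cluster. The `FleetConfig` class and `fleet_footprint` function are hypothetical names used only for illustration and are not part of VMFleet; the numbers come from Tables 2 and 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FleetConfig:
    """Per-VM and per-node test parameters taken from Table 3."""
    nodes: int
    vms_per_node: int
    vcpus_per_vm: int
    memory_gb_per_vm: int
    vhdx_gb_per_vm: int


def fleet_footprint(cfg: FleetConfig) -> dict:
    """Return the aggregate resources requested by the test VMs."""
    total_vms = cfg.nodes * cfg.vms_per_node
    return {
        "total_vms": total_vms,
        "vcpus_per_node": cfg.vms_per_node * cfg.vcpus_per_vm,
        "memory_gb_per_node": cfg.vms_per_node * cfg.memory_gb_per_vm,
        "total_vhdx_gb": total_vms * cfg.vhdx_gb_per_vm,
    }


# Two AX-6515 nodes, 20 VMs per node, 2 vCPUs, 8 GB RAM, and a 40 GB VHDX per VM.
lab = FleetConfig(nodes=2, vms_per_node=20, vcpus_per_vm=2,
                  memory_gb_per_vm=8, vhdx_gb_per_vm=40)
footprint = fleet_footprint(lab)
print(footprint)
```

With these inputs, each node hosts 40 vCPUs and 160 GB of VM memory, which fits within the 64 cores and 256 GB of RAM listed per node in Table 2.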

We first selected DiskSpd scenarios aimed at identifying the maximum IOPS, latency, and throughput thresholds of the cluster. By pushing the limits of the storage subsystem, we confirmed that the networking, compute, operating systems, and virtualization layer were configured correctly according to our Deployment Guide and Network Integration and Host Network Configuration Options guide. This also ensured that no misconfiguration occurred during initial deployment that could skew the real-world storage performance results. Our results are shown in Table 4.

Table 4 Maximums test results

Scenario	Parameter Values Explained	Performance Metric
B4-T2-O32-W0-PR	Block size: 4k; Thread count: 2; Outstanding IO: 32; IO pattern: 100% random read	IOPS: 1,146,948; Read latency: 245 microseconds; CPU utilization: 48%
B4-T2-O32-W100-PR	Block size: 4k; Thread count: 2; Outstanding IO: 32; IO pattern: 100% random write	IOPS: 417,591; Write latency: 4 milliseconds; CPU utilization: 25%
B512-T2-O2-W0-PSI	Block size: 512k; Thread count: 2; Outstanding IO: 8; IO pattern: 100% sequential read	Throughput: 12 GB/s
B512-T2-O2-W100-PSI	Block size: 512k; Thread count: 2; Outstanding IO: 8; IO pattern: 100% sequential write	Throughput: 6 GB/s
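
The scenario names in Table 4 pack the test parameters into a compact string. As a rough sketch, the hypothetical `parse_scenario` helper below decodes such a name, assuming the layout `B<block size in k>-T<threads>-O<outstanding IO>-W<write %>-P<pattern>` and assuming that `R` means random and `SI` means sequential, as the table's descriptions suggest. It reports only what the name encodes; for the two sequential rows, the name says `O2` while the description column lists an outstanding IO of 8, and the sketch does not try to reconcile the two.

```python
import re

# Assumed layout of a scenario name, inferred from the rows of Table 4.
_SCENARIO = re.compile(
    r"^B(?P<block>\d+)-T(?P<threads>\d+)-O(?P<outstanding>\d+)"
    r"-W(?P<write>\d+)-P(?P<pattern>R|SI)$"
)

# Assumed meaning of the pattern suffix, based on the table's descriptions.
_PATTERNS = {"R": "random", "SI": "sequential"}


def parse_scenario(name: str) -> dict:
    """Decode a scenario name such as 'B4-T2-O32-W0-PR' into its parameters."""
    match = _SCENARIO.match(name)
    if match is None:
        raise ValueError(f"unrecognized scenario name: {name!r}")
    write_pct = int(match["write"])
    if write_pct > 100:
        raise ValueError(f"write percentage out of range: {write_pct}")
    return {
        "block_size_k": int(match["block"]),
        "threads": int(match["threads"]),
        "outstanding_io": int(match["outstanding"]),
        "write_pct": write_pct,
        "pattern": _PATTERNS[match["pattern"]],
    }


print(parse_scenario("B4-T2-O32-W0-PR"))
print(parse_scenario("B512-T2-O2-W100-PSI"))
```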

We then stressed the storage subsystem using IO patterns more reflective of the types of workloads found in a ROBO use case. These applications are typically characterized by smaller block sizes, random I/O patterns, and a variety of read/write ratios. Examples include general enterprise and small office LOB applications and OLTP workloads. The testing results in Figure 2 below indicate that the cluster has the potential to accelerate OLTP workloads and make enterprise applications highly responsive to end users.

Figure 2 Performance results with smaller block sizes

Other services like backups, streaming video, and large dataset scans have larger block sizes and sequential IO patterns. With these workloads, throughput becomes the key performance indicator to analyze. The results shown in the following graph indicate an impressive sustained throughput that can greatly benefit this category of IT services and applications.

Figure 3 Performance results with larger block sizes

Optional modifications to the architecture

Customers could make modifications to the lab configuration to accommodate different requirements in the ROBO use case. For example, Dell Technologies fully supports a dual-link full mesh topology for Azure Stack HCI. This non-converged, storage switchless topology eliminates the need for network switches for storage communications and enables you to use existing infrastructure for management and VM traffic. This approach will result in similar or improved performance metrics versus those mentioned in this blog due to the 2 x 25 GbE direct connections between the nodes and the isolation of the storage traffic on these dedicated connections.

Figure 4 Two-node back-to-back architecture option

There may be situations in ROBO scenarios where there are no IT generalists near the site to address hardware failures. When a drive or entire node fails, it may take days or weeks before someone can service the nodes and return the cluster to full functionality. Consider nested resiliency instead of two-way mirroring to handle multiple failures on a two-node cluster. Inspired by RAID 5 + 1 technology, nested resiliency keeps workloads online and accessible even in the circumstances illustrated in Figure 5.

Figure 5 Nested resiliency option

Be aware that there is a capacity efficiency cost when using nested resiliency. Two-way mirroring is 50% efficient, meaning 1 TB of data takes up 2 TB of physical storage capacity. Depending on the type of nested resiliency you choose to configure, capacity efficiency can range from 25% to 40%. Therefore, ensure you have an adequate amount of raw storage capacity if you intend to use this technology. Performance is also going to be affected when using nested resiliency – especially on workloads with a higher percentage of write IO, since more copies of the data need to be maintained on the cluster.
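
The capacity trade-off described above is simple arithmetic. The sketch below, using the hypothetical helper `physical_needed_tb`, shows how much physical capacity 1 TB of data would occupy at the efficiencies quoted in this post. It ignores reserve capacity and metadata overhead, so treat the output as an approximation rather than a sizing tool.

```python
def physical_needed_tb(data_tb: float, efficiency: float) -> float:
    """Physical capacity consumed by `data_tb` of data at a given efficiency.

    `efficiency` is a fraction: 0.5 for two-way mirroring, and between
    0.25 and 0.40 for nested resiliency according to the figures above.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    return data_tb / efficiency


# Efficiency values quoted in the text; the labels are for illustration only.
for label, eff in [("two-way mirror", 0.50),
                   ("nested resiliency, best case", 0.40),
                   ("nested resiliency, worst case", 0.25)]:
    print(f"{label}: {physical_needed_tb(1.0, eff):.2f} TB physical per 1 TB of data")
```

At the low end of the nested resiliency range, the same data needs twice the physical capacity that two-way mirroring requires.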

If you need greater flexibility in cluster resources, Dell Technologies offers Azure Stack HCI configurations to meet any workload profile and business requirement. The table below shows the different resource options available for each AX node. For detailed specifications of these configurations, please see our product page.

Table 5 Azure Stack HCI configuration options

Visit our website for more details on Dell EMC Solutions for Azure Stack HCI.

Intel Azure Stack HCI Storage Spaces Direct Hyper-V Microsoft AX nodes Optane

Boost Performance on Dell EMC HCI Solutions for Microsoft Server using Intel Optane Persistent Memory

Anil Papisetty Michael Lamia

Wed, 16 Jun 2021 13:35:49 -0000

Modern IT applications have a broad range of performance requirements. Some of the most demanding applications use Online Transactional Processing (OLTP) database technology. Typical organizations have many mission critical business services reliant on workloads powered by these databases. Examples of such services include online banking in the financial sector and online shopping in the retail sector. If the response time of these systems is slow, customers will likely suffer a poor user experience and may take their business to competitors. Dissatisfied customers may also express their frustration through social media outlets resulting in incalculable damage to a company’s reputation.

The challenge in maintaining an exceptional consumer experience is providing databases with performant infrastructure while also balancing capacity and cost. Traditionally, there have been few cost-effective options that cache database workloads, which would greatly improve end-user response times. Intel Optane persistent memory (Intel Optane PM) offers an innovative path to accelerating database workloads. Intel Optane PM performs almost as well as DRAM, and the data is preserved after a power cycle. We were interested in quantifying these claims in our labs with Dell EMC HCI Solutions for Microsoft Windows Server.

Windows Server HCI running Microsoft Windows Server 2019 provides industry-leading virtual machine performance with Microsoft Hyper-V and Microsoft Storage Spaces Direct technology. The platform supports Non-Volatile Memory Express (NVMe), Intel Optane PM, and Remote Direct Memory Access (RDMA) networking. Windows Server HCI is a fully productized, validated, and supported HCI solution that enables enterprises to modernize their infrastructure for improved application uptime and performance, simplified management and operations, and lower total cost of ownership. AX nodes from Dell EMC, powered by industry-leading PowerEdge server platforms, offer a high-performance, scalable, and secure foundation on which to build a software-defined infrastructure.

In our lab testing, we wanted to observe the impact on performance when Intel Optane PM was added as a caching tier to a Windows Server HCI cluster. We set up two clusters to compare. One cluster was configured as a two-tier storage subsystem with Intel Optane PM in the caching tier and SATA Read-Intensive SSDs in the capacity tier. We inserted 12 x 128 GB Intel Optane PM modules into each node of this cluster for a total of 1.5 TB per node. The other cluster’s storage subsystem was configured as a single tier of SATA Read-Intensive SSDs. With respect to CPU selection, memory, and Ethernet adapters, the two clusters were configured identically.

Only the Dell EMC AX-640 nodes currently accommodate Intel Optane PM. The clusters were configured as follows:

Cluster Resources	Without Intel Optane PM	With Intel Optane PM
Number of nodes	4	4
CPU	2 x Intel 6248 CPU @ 2.50 GHz (3.90 GHz with TurboBoost)	2 x Intel 6248 CPU @ 2.50 GHz (3.90 GHz with TurboBoost)
Memory	384 GB RAM	384 GB RAM
Disks	10 x 2.5 in. 1.92 TB Intel S4510 RI SATA SSD	10 x 2.5 in. 1.92 TB Intel S4510 RI SATA SSD
NICs	Mellanox ConnectX-5 EX Dual Port 100 GbE	Mellanox ConnectX-5 EX Dual Port 100 GbE
Persistent memory	None	12 x 128 GB Intel Optane PM per node

Volumes were created using three-way mirroring for the best balance between performance and resiliency. Three-way mirroring protects data by enabling the cluster to safely tolerate two hardware failures. For example, data on a volume would be successfully preserved even after the simultaneous loss of an entire node and a drive in another node.

Intel Optane PM has two operating modes – Memory Mode and App Direct Mode. Our tests used App Direct Mode. In App Direct Mode, the operating system uses Intel Optane PM as persistent memory distinct from DRAM. This mode enables extremely high-performing storage that is byte-addressable-like, memory coherent, and cache coherent. Cache coherence is important because it ensures that data is a uniformly shared resource across all nodes. In the four-node Windows Server HCI cluster, cache coherence ensured that when data was read from or written to one node, the same data was available across all nodes.

VMFleet is a storage load generation tool designed to perform I/O and capture performance metrics for Microsoft failover clusters. For the small block test, we used VMFleet to generate 100 percent reads at a 4K block size. The baseline configuration without Intel Optane PM sustained 2,103,412 IOPS at 1.5-millisecond (ms) average read latency. These baseline performance metrics demonstrated outstanding performance. However, OLTP databases target 1 ms or less latency for reads.

By comparison, the Intel Optane PM cluster delivered 43 percent higher IOPS and decreased latency by 53 percent. Overall, this cluster sustained slightly over 3 million IOPS at 0.7 ms average latency. Benefits include:

Significant performance improvement in IOPS means transactional databases and similar workloads will improve in scalability.
Applications reading from storage will receive data faster, thus improving transactional response times.
Intel Optane PM coherent cache provides substantial performance benefits without sacrificing availability.

When exploring storage responsiveness, testing large block read and write requests is also important. Data warehouses and decision-support systems are examples of workloads that read larger blocks of data. For this testing, we used 512 KB block sizes and sequential reads as part of the VMFleet testing. This test provided insight into the ability of Intel Optane PM cache to improve storage system throughput.

The cluster populated with Intel Optane PM delivered 109 percent higher throughput than the baseline system. Our comparisons of 512 KB sequential reads found total throughput of 11 GB/s for the system without Intel Optane PM and 23 GB/s for the system with Intel Optane PM caching. Benefits include:

Greater throughput enables faster scans of data for data warehouse systems, decision-support systems, and similar workloads.
The benefit to the business is faster reporting and analytics.
Intel Optane PM coherent cache provides substantial throughput benefits without sacrificing availability.
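
The percentages quoted for the two tests can be checked against the reported figures. The short sketch below uses a hypothetical `percent_change` helper. Because the latency and throughput values in this post are rounded, the computed results only approximate the published percentages.

```python
def percent_change(baseline: float, new: float) -> float:
    """Relative change from `baseline` to `new`, expressed in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# 4K random reads: average latency fell from 1.5 ms to 0.7 ms.
latency_change = percent_change(1.5, 0.7)

# 512 KB sequential reads: throughput rose from 11 GB/s to 23 GB/s.
throughput_change = percent_change(11, 23)

# A 43 percent gain applied to the 2,103,412 IOPS baseline.
iops_with_pm = 2_103_412 * 1.43

print(f"latency change:    {latency_change:.1f}%")
print(f"throughput change: {throughput_change:.1f}%")
print(f"IOPS with PM:      {iops_with_pm:,.0f}")
```

The results line up with the text: roughly a 53 percent drop in latency, a 109 percent gain in throughput, and slightly over 3 million IOPS.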

Overall, the VMFleet tests were impressive. Both Windows Server HCI configurations had 40 SSDs across the four nodes for approximately 76 TB of performant storage. Accelerating the entire cluster required 12 Intel Optane PM 128 GB modules per server, for a total of 48 modules across the four nodes. Test results show that both OLTP and data-warehouse type workloads would exhibit significant performance improvements.
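
As a rough sanity check on these totals, the sketch below recomputes the raw SSD capacity and the persistent memory footprint from the cluster table. It also estimates the capacity left after three-way mirroring. That estimate is an illustration rather than a figure from the testing: it assumes three full copies of all data and ignores reserve capacity. Only the SSDs are counted toward capacity because the persistent memory serves as the caching tier in this design.

```python
NODES = 4
SSDS_PER_NODE = 10
SSD_TB = 1.92                 # Capacity of each Intel S4510 SSD
PM_MODULES_PER_NODE = 12
PM_MODULE_GB = 128
MIRROR_COPIES = 3             # Three-way mirroring keeps three copies of the data

total_ssds = NODES * SSDS_PER_NODE
raw_ssd_tb = total_ssds * SSD_TB

total_pm_modules = NODES * PM_MODULES_PER_NODE
pm_gb_per_node = PM_MODULES_PER_NODE * PM_MODULE_GB

# Assumption: every volume is a three-way mirror and no capacity is held in reserve.
mirrored_tb = raw_ssd_tb / MIRROR_COPIES

print(f"SSDs: {total_ssds}, raw capacity: {raw_ssd_tb:.1f} TB")
print(f"PM modules: {total_pm_modules}, PM per node: {pm_gb_per_node} GB")
print(f"Approximate capacity after three-way mirroring: {mirrored_tb:.1f} TB")
```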

Testing 100 percent reads of 4K blocks showed:

43 percent performance improvement in IOPS.
53 percent decrease in average read latency.
Improved scaling and faster transaction processing. Overall, application performance would be significantly accelerated, improving end-user experience.

Testing 512 KB sequential reads showed:

109 percent increased throughput.
Faster reporting and faster time to analytics and data insights.

The configuration presented in this lab testing scenario will not be appropriate for every application. Any Windows Server HCI solution must be properly scoped and sized to meet or exceed the performance and capacity requirements of its intended workloads. Work with your Dell Technologies account team to ensure that your system is correctly configured for today’s business challenges and ready for expansion in the future. To learn more about Microsoft HCI Solutions from Dell Technologies, visit our Info Hub page.

AI NVIDIA Microsoft machine learning AMD Azure Stack Hub

GPU-Accelerated AI and ML Capabilities

Michael Lamia

Mon, 14 Dec 2020 15:37:06 -0000

Dell EMC Integrated System for Microsoft Azure Stack Hub has been extending Microsoft Azure services to customer-owned data centers for over three years. Our platform has enabled organizations to create a hybrid cloud ecosystem that drives application modernization and to address business concerns around data sovereignty and regulatory compliance.

Dell Technologies, in collaboration with Microsoft, is excited to announce upcoming enhancements that will unlock valuable, real-time insights from local data using GPU-accelerated AI and ML capabilities. Actionable information can be derived from large on-premises data sets at the intelligent edge without sacrificing security.

Partnership with NVIDIA

Today, customers can order our Azure Stack Hub dense scale unit configuration with NVIDIA Tesla V100S GPUs for running compute-intensive AI processes like inferencing, training, and visualization from virtual machine or container-based applications. Some customers choose to run Kubernetes clusters on their hardware-accelerated Azure Stack Hub scale units to process and analyze data sent from IoT devices or Azure Stack Edge appliances. Powered by the Dell EMC PowerEdge R840 rack server, these NVIDIA Tesla V100S GPUs use Discrete Device Assignment (DDA), also known as GPU pass-through, to dedicate one or more GPUs to an Azure Stack Hub NCv3 VM.

The following figure illustrates the resources installed in each GPU-equipped Azure Stack Hub dense configuration scale unit node.

This month, our Dell EMC Azure Stack Hub release 2011 will also support the NVIDIA T4 GPU – a single-slot, low-profile adapter powered by NVIDIA Turing Tensor Cores. These GPUs are perfect for accelerating diverse cloud-based workloads, including light machine learning, inference, and visualization. These adapters can be ordered with Dell EMC Azure Stack Hub all-flash scale units powered by Dell EMC PowerEdge R640 rack servers. Like the NVIDIA Tesla V100S, these GPUs use DDA to dedicate one adapter’s powerful capabilities to a single Azure Stack Hub NCas_v4 VM. A future Azure Stack Hub release will also enable GPU partitioning on the NVIDIA T4.

The following figure illustrates the resources installed in each GPU-equipped Azure Stack Hub all-flash configuration scale unit node.

Partnership with AMD

We are also pleased to announce a partnership with AMD to deliver GPU capabilities in our Dell EMC Integrated System for Microsoft Azure Stack Hub. Customers can order our dense scale unit configuration today with AMD Radeon Instinct MI25 GPUs aimed at graphics-intensive visualization workloads like simulation, CAD applications, and gaming. The MI25 uses GPU partitioning (GPU-P) technology to allow users of an Azure Stack Hub NVv4 VM to consume only a portion of the GPU’s resources based on their workload requirements.

The following table is a summary of our hardware acceleration capabilities.

An engineered approach

Following our stringent engineered approach, Dell Technologies goes far beyond considering GPUs as just additional hardware components in the Dell EMC Integrated System for Microsoft Azure Stack Hub portfolio. We apply our pedigree as leaders in appliance-based solutions to the entire lifecycle of all our scale unit configurations. The dense and all-flash scale unit configurations with integrated GPUs are designed to follow best practices and use cases specifically with Azure-based workloads, rather than workloads running on traditional virtualization platforms. Dell Technologies is also committed to ensuring a simplified experience for initial deployment, patch and update, support, and streamlined operations and monitoring for these new configurations.

Additional considerations

There are a couple of additional details worth mentioning about our new Azure Stack Hub dense and all-flash scale unit configurations with hardware acceleration:

The use of the GPU-backed N-Series VMs in Azure Stack Hub for compute-intensive AI and ML workloads is still in preview. Dell Technologies is very interested in speaking with customers about their use cases and workloads supported by this configuration. Please contact us at mhc.preview@dell.com to speak with one of our engineering technologists.
The Dell EMC Integrated System for Microsoft Azure Stack Hub configurations with GPUs can be delivered fully racked and cabled in our Dell EMC rack. Customers can also elect to have the scale unit components re-racked and cabled in their own existing cabinets with the assistance of Dell Technologies Services.

Resources for further study

At the time of publishing this blog post, only the NCv3 and NVv4 VMs are available in the Azure Stack Hub marketplace. The NCas_v4 currently is not visible in the portal. Please proceed to the Azure Stack Hub User Documentation for more information on these VM sizes.
Customers may want to explore the Train Machine Learning (ML) model at the edge design pattern in the Azure Hybrid Documentation. This may prove to be a good starting point for putting this technology to work for their organization.
Customers considering running AI and ML workloads on Dell EMC Integrated System for Microsoft Azure Stack Hub can also greatly benefit from storage-as-a-service with Dell EMC PowerScale. PowerScale can help enable faster training and validation of AI models, improve model accuracy, drive higher GPU utilization, and increase data science productivity. Visit Artificial Intelligence with Dell EMC PowerScale for more information.
