Wed, 24 Apr 2024 12:57:21 -0000
First for the “some history” part! Converged Infrastructure (CI) is not a new concept – it’s been with us for more than 10 years. Hey, one could even consider that we were the “inventors” of CI back in the days when we publicly announced it on November 03, 2009 (Press Release).
Figure 1. EMC Joe Tucci (center) unveils the Virtual Computing Environment coalition with VMware's Paul Maritz (left) and Cisco's John Chambers (right).
Nowadays the CI concept is well understood, but in 2009 it was groundbreaking, since this approach had never been taken before.
“All datacenter requirements in just one system? How could that be possible?”
Those were the days of separated, disconnected, and siloed domains (compute, storage, networks) and CI was a new disruptive technology solution that would require a complete transformation in how IT would architect and consume datacenter infrastructure.
To better understand how Dell Technologies views CI, then and now, consider how we define our end-to-end engineered turnkey CI system:
VxBlock 1000, Industry-leading Converged Infrastructure, simplifies all aspects of IT by seamlessly integrating all the compute, network, storage and data protection and cloud management technologies you need into one engineered system. It is an all-in-one, “data center in a box.” You can offload the complexities and risks associated with managing enterprise-grade data center infrastructure so that your IT teams can confidently focus on higher-value activities. (from Top Reasons Why Organizations Choose VxBlock 1000 Converged Infrastructure).
For those new to VxBlock 1000, here are some of the most important values VxBlock has provided:
And what has happened during this 10+ year period?
Many things. Many milestones. Many systems sold. Many successful customer stories and projects that have led VxBlock to a very effective and consistent $1 billion annual run rate business just four years into its existence. Check the diagram below to reflect on some of the key milestones VxBlock and Dell Technologies CI as a whole have delivered during this decade.
Figure 2. VxBlock 1000 one-decade journey
Today, with more than 4500 systems installed in over 100 countries, VxBlock 1000 keeps on leading the way, innovating the CI arena in four key areas that address the second part of this blog post, namely “What’s New”:
VxBlock System 1000 gives you a choice of industry-leading technologies to meet the needs of all your different workloads, ranging from mission-critical, general purpose (virtualized or not), Artificial Intelligence/Machine Learning, End User Computing/Virtual Desktops… you name it!
Mix and match powerful Dell EMC storage and data protection options, Cisco UCS blade and rack servers, Cisco LAN and SAN networking, and VMware virtualization and cloud management. For more details on infrastructure see VxBlock 1000 data sheet and specs.
Since VxBlock 1000 is not just a reference architecture or a bill of materials, it eliminates the traditional risks associated with “Do It Yourself” approaches. It’s a fully integrated system that is engineered, manufactured, managed, supported, and sustained as one product, delivering a turnkey experience. Dell Technologies validates interoperability of components and provides a predictable system maintenance process that improves availability and productivity.
VxBlock 1000 leverages its deep VMware integration to simplify automation of everything from daily infrastructure provisioning tasks to delivery of IaaS and SaaS. At the foundation is VxBlock Central software that provides a single unified interface and access point for converged infrastructure operations.
Figure 3. VxBlock Central, a single pane of glass for management, automation, and LCM
VxBlock Central software dramatically simplifies daily administration by providing enhanced system-level awareness, automation and analytics, including launch points to:
As customers place workloads on top of VxBlock 1000, VxBlock Central helps to provide and maintain these services by managing the infrastructure underneath. See here for more great info about VxBlock Central Workflow Automation and the 40+ workflows available in the Workflow Automation Library.
Dell EMC CloudIQ for VxBlock features next-generation lifecycle management (LCM) that enables IT teams to more flexibly plan ahead and control converged hardware lifecycle, further reducing risk with proactive SaaS-based insights. You gain granular control over hardware inventory, milestones, support interoperability, and upgrade scenarios.
VxBlock 1000 is built with a perpetual design, meaning it will ensure that your system stays ready to support the introduction of next-generation technologies within any of the fundamental domains of the system, whether storage, compute, or network. You can address increased performance and scalability requirements while maximizing the return on your system investment.
Dell Technologies delivers fully integrated 24/7 support with a single call. There’s never any finger-pointing between vendors. You can always rely on our fully cross-trained team for a fast resolution to any problem. Our portfolio of services (including deployment services, migration services, and residency services) accelerates deployment and integration into your IT environment. It also minimizes downtime by ensuring your software and hardware remain up to date throughout the product lifecycle.
One decade ago, Dell Technologies defined the foundations for CI and created a platform that has evolved to what today is VxBlock 1000. This system (compute, network, storage and management layer) is created (engineered and manufactured), maintained (single management and support), and sustained (in ongoing certified code upgrades) by Dell EMC during its entire journey. Customers simply take the keys of the car and drive.
Ignacio Borrero - LinkedIn, Twitter: @virtualpeli
Wed, 06 Mar 2024 15:29:18 -0000
In September 2023, we officially released Dell APEX Cloud Platform for Microsoft Azure, the first offer in the market for Premier Solutions for Microsoft Azure Stack HCI.
Collaboratively built with Microsoft, this new platform extends and optimizes Azure Hybrid Cloud to on-premises, delivering three fundamental benefits:
Figure 1. Dell APEX Cloud Platform for Microsoft Azure Architecture
The innovation at Dell Technologies never stops. We are constantly developing and improving our products, and we have just launched our first update to the platform. Briefly, this release introduces and enhances:
Check out those updates in greater detail in these blogs: What's New with the Dell APEX Cloud Platform for Microsoft Azure March 2024 Release and Dell Technologies First to Deliver Azure Stack HCI 23H2.
In this blog, we want to put the spotlight on one particularly significant, useful, and easy to consume new capability – Event Monitoring for Dell APEX Cloud Platform for Microsoft Azure.
Dell APEX Cloud Platform for Microsoft Azure seamlessly integrates with Microsoft’s Azure Portal, providing the ability to monitor events generated on both Dell APEX Cloud Platform for Microsoft Azure hardware and the Cloud Platform Manager VM.
This new Insights for Azure Stack HCI monitoring feature lets customers view, directly in Azure Portal, informational event data generated by the multicloud (MC) node hardware and the Cloud Platform Manager VM, using an integrated Insights workbook.
With this workbook, we are empowering users to effectively manage and optimize their clusters and, in turn, receive the benefit of accelerated issue detection and time to resolution. I know, we’re excited too.
Not really. Simply follow these steps:
Figure 2. Enabling Event Monitoring in Azure portal
Boom. Done. That was easy, and now the workbook is enabled…what is next?
Once the page refreshes, you’ll be taken to the first of the two tabs of the workbook – the Overview tab – which provides a brief description of what this workbook is and the information it can provide to its users.
Figure 3. Event Monitoring for Dell APEX Cloud Platform Overview tab
The second tab in the workbook – the Health tab – presents a summary of the alerts or events that have occurred on the cluster, broken down into Warning, Critical, and Informational alerts.
The Health tab also provides a Nodes table with a high-level overview of each node for the selected time range, including which cluster it belongs to, the node name, health status, node state, uptime, and domain.
Figure 4. Event Monitoring for APEX Cloud Platform for Microsoft Azure Health tab
A second table – the Alerts table – shows each alert in greater detail, including its corresponding node, component and subcomponent, severity level, event code, product service tag number, reported time, a short description, and even a knowledgebase article for issue diagnosis and troubleshooting guidance.
Note that you can leverage the Search bar to filter the information based on a given search term and the Time Range drop-down menu to show the events that occurred on all the MC nodes for the cluster within a specific time range.
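As a rough illustration of what the Health tab summarizes, the sketch below tallies a list of alerts by severity and filters them by time range and search term. The `Alert` record, its field names, and the sample events are hypothetical, chosen only to mirror the columns described above; this is not the workbook's actual query or data model.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Alert:
    # Hypothetical fields mirroring some columns of the Alerts table.
    node: str
    component: str
    severity: str  # "Critical", "Warning", or "Informational"
    event_code: str
    reported: datetime
    description: str


def filter_alerts(alerts, since, search=""):
    """Keep alerts reported at or after `since` whose text contains `search`."""
    term = search.lower()
    return [
        a for a in alerts
        if a.reported >= since
        and term in f"{a.node} {a.component} {a.event_code} {a.description}".lower()
    ]


def summarize(alerts):
    """Count alerts per severity, similar to the Health tab summary."""
    counts = Counter(a.severity for a in alerts)
    return {s: counts.get(s, 0) for s in ("Critical", "Warning", "Informational")}


# Made-up sample data for illustration only.
now = datetime(2024, 3, 6, 12, 0)
sample_alerts = [
    Alert("node-1", "Fan", "Warning", "EVT-001", now - timedelta(hours=1), "Fan speed low"),
    Alert("node-2", "Disk", "Critical", "EVT-002", now - timedelta(hours=2), "Disk failure predicted"),
    Alert("node-1", "PSU", "Informational", "EVT-003", now - timedelta(days=3), "Power supply reseated"),
]

last_24h = filter_alerts(sample_alerts, since=now - timedelta(hours=24))
print(summarize(last_24h))
```

Narrowing the time range or adding a search term simply shrinks the list that feeds the summary, which is the same behavior the Time Range menu and Search bar provide in the workbook.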
Our workbook, Event Monitoring for Dell APEX Cloud Platform for Microsoft Azure, lets you monitor events generated on both Dell APEX Cloud Platform for Microsoft Azure hardware and the Cloud Platform Manager VM from within the Azure Portal.
This powerful integration provides a great deal of value, significantly reducing the issue detection time and time to resolution.
Thanks for reading, and stay tuned for more updates in Info Hub!
We have tons of great content to help you deep-dive into Dell APEX Cloud Platform for Microsoft Azure powered by Dell APEX Cloud Platform Foundation Software:
And as always, please reach out to your Dell Technologies account team if you would like to have more in-depth discussions about the Dell APEX Cloud Platforms family. If you don’t currently have a Dell Technologies contact, we’re here to help on our corporate website.
Author: Ignacio Borrero, Senior Principal Engineer, Technical Marketing Dell CI & HCI
@virtualpeli
Concept | Definition |
---|---|
Dell APEX Cloud Platform for Microsoft Azure hardware | A turnkey on-premises infrastructure platform, collaboratively engineered between Dell Technologies and Microsoft to optimize Azure hybrid cloud operations. Based on multicloud (MC) nodes as the cluster(s) foundation. |
Cloud Platform Manager VM | Each cluster runs the Dell APEX Cloud Platform Foundation Software in a Cloud Platform Manager VM. This software is responsible for communicating with the underlying infrastructure and integrating automation workflows into Microsoft Windows Admin Center. |
Azure Workbook | A flexible canvas for data analysis and the creation of rich visual reports within the Azure portal. |
Thu, 15 Feb 2024 12:34:55 -0000
On September 26, 2023, we introduced to the market the new Dell APEX Cloud Platform for Microsoft Azure. It is the first offer for Premier Solutions for Microsoft Azure Stack HCI, a new category in the Azure Stack HCI catalog reserved for key partners with the greatest levels of engagement with Microsoft and deepest integrations into familiar Microsoft management tools.
Dell APEX Cloud Platform for Microsoft Azure is a fully integrated infrastructure platform designed to optimize Microsoft Azure hybrid cloud deployments by streamlining operations and accelerating time-to-value across on-prem, edge, and Azure cloud environments. It greatly simplifies initial deployment and ongoing operations across the complete technology stack.
Security for Dell APEX Cloud Platform for Microsoft Azure is not an afterthought, but rather an integral part of the overall platform design process that leverages our Cyber Resilient Architecture and inherits Dell’s hardened server and software design to protect, detect, and recover from cyberattacks.
Full stack lifecycle management is key to maintaining a strong security posture throughout the life of your APEX Cloud Platforms, continuously and consistently applying Dell and Microsoft updates without risks to the platform and running workloads.
Dell APEX Cloud Platform for Microsoft Azure also leverages intrinsic infrastructure security management through Dell Infrastructure Lock and Secured-core server functionalities.
You can learn more on these platform features in this video.
Dell APEX Cloud Platform for Microsoft Azure takes full advantage of the security features that come with Azure Stack HCI:
Microsoft Defender for Cloud and Azure Policy assess, secure, and defend Dell APEX Cloud Platform for Microsoft Azure at-scale:
With this approach, the entire platform stack is covered – Azure Stack HCI, VMs, AKS hybrid workload cluster, and virtualized and cloud-native applications.
You can learn more on these platform features in this video.
If you want to go deeper and learn about all the elements that work together to secure the platform end to end, you can read our Dell APEX Cloud Platform for Microsoft Azure Security Configuration Guide, where we provide the configuration details for:
Dell APEX Cloud Platform for Microsoft Azure enhances Azure operations for edge and on-premises deployments by providing consistent management with centralized Azure tools while mitigating security and compliance risks with an intrinsic approach to security that extends Azure governance across all deployment environments.
Thanks for reading and… stay tuned for more updates in Info Hub!
Author: Ignacio Borrero, Senior Principal Engineer, HCI and Multicloud Technical Marketing
@virtualpeli
Wed, 30 Aug 2023 22:05:17 -0000
The first half of 2023 has been quite prolific for the Dell Azure Stack HCI ecosystem, providing many important incremental updates in the platform. This article summarizes the most relevant changes inside the program.
Dell Integrated System for Microsoft Azure Stack HCI delivers a fully productized, validated, and supported hyperconverged infrastructure solution that enables organizations to modernize their infrastructure for improved application uptime and performance, simplified management and operations, and lower total cost of ownership. The solution integrates the software-defined compute, storage, and networking features of Microsoft Azure Stack HCI with AX nodes from Dell to offer the high-performance, scalable, and secure foundation needed for a software-defined infrastructure.
With Azure Arc, we can now unlock new hybrid scenarios for customers by extending Azure services and management to our HCI infrastructure. This allows customers to build, operate, and manage all their resources for traditional, cloud-native, and distributed edge applications in a consistent way across the entire IT estate.
A lot. There have been so many updates in the Azure Stack HCI front that it is difficult to detail all of them in just a single blog. So, let’s focus on the most important ones.
From a software and hardware perspective, the biggest change during the first half of 2023 was the introduction of Azure Stack HCI, version 22H2 (factory install and field support). The most important features in this release are Network ATC, GPU partitioning (GPU-P), and security improvements.
From a hardware perspective, these are the most relevant additions to the AX node family:
To better understand GPU-P and DDA, check this blog.
As the platforms mature, it is inevitable that some of the aging components are discontinued or replaced with newer versions. The most important changes on this front have been:
Finally, we have introduced a set of short and easily digestible training videos (seven minutes each, on average) to learn everything you need to know about Azure Stack HCI, from the AX platform and Microsoft’s Azure Stack HCI operating system, to the management tools and deploy/support services.
It’s certainly a challenge to synthesize the last six months of incredible innovation into a brief article, but we have highlighted the most important updates: focus on learning all the new Azure Stack HCI 22H2 features and updates, keep current with the hardware updates, and… stay tuned for important announcements by the last quarter of the year: big things are coming in Part 2!!!
Thank you for reading.
Author: Ignacio Borrero, Senior Principal Engineer, Technical Marketing
Tue, 25 Apr 2023 17:05:23 -0000
It is incredible how time flies: it still feels like yesterday that Microsoft initially released Azure Stack HCI, on December 10, 2020.
Today, Azure Stack HCI is a huge success and, in combination with Azure Arc, the foundation for any real Microsoft hybrid strategy.
But believe it or not, 850+ days later, Azure Stack HCI is still a big unknown for part of the Microsoft community. In our daily customer engagements, we keep on observing that there are knowledge gaps around the Azure Stack HCI program itself and the partner ecosystem that surrounds it.
In these circumstances, we have decided to take action and create a very short and easy-to-follow video series explaining everything you need to know about Azure Stack HCI from a technical perspective.
This initial video training library consists of five videos, each averaging seven minutes in length. Here’s a summary of what you will discover in each of the videos:
Video: What Is Inside Azure Stack HCI
Learn the basics and fundamental components of Azure Stack HCI and get to know the Dell Integrated System for Microsoft Azure Stack HCI platform.
Meet the AX node platform and take the Dell Integrated System for Microsoft Azure Stack HCI route to deliver consistent Azure Stack HCI deployments.
Video: Topology and networking
Explore topology and network deployment options for Dell Integrated System for Microsoft Azure Stack HCI. Make good Azure Stack HCI environments even better with the Dell PowerSwitch family.
Learn about Azure Stack HCI local management with Windows Admin Center and OpenManage. This is the perfect combination for quick and easy controlled local deployments…and a solid foundation for true hybrid management.
Learn about end-to-end deployment and support for the Dell Integrated System for Microsoft Azure Stack HCI platform with ProDeploy and ProSupport services.
Absolutely.
We are already working on the next series where we’ll be covering other important topics that are beyond the scope for this initial launch (such as best practices and stretched clusters).
There is no doubt that Azure Stack HCI is a very hot topic. In fact, it is the key foundational element that enables a true Microsoft hybrid strategy by delivering on-premises infrastructure fully integrated with Azure. This video series explains the different elements that make this possible.
All videos in the series are important and none should be skipped, but if there is one not to be missed, go for Dell Azure Stack HCI: Local Management. This topic is actually the hook for the next release (hint: hybrid management is the next big thing!).
Thanks for reading and… stay tuned for additional videos on the Info Hub!
Author: Ignacio Borrero, Senior Principal Engineer, Technical Marketing Dell CI & HCI
Wed, 01 Feb 2023 15:50:35 -0000
The end of 2022 brought us excellent news: Dell Integrated System for Azure Stack HCI introduced full support for GPU factory install.
As a reminder, Dell Integrated System for Microsoft Azure Stack HCI is a fully integrated HCI system for hybrid cloud environments that delivers a modern, cloud-like operational experience on-premises. It is intelligently and deliberately configured with a wide range of hardware and software component options (AX nodes) to meet the requirements of nearly any use case, from the smallest remote or branch office to the most demanding business workloads.
With the introduction of GPU-capable AX nodes, now we can also support more complex and demanding AI/ML workloads.
Not all AX nodes support GPUs. As you can see in the table below, AX-750, AX-650, and AX-7525 nodes running Azure Stack HCI 21H2 or later are the only AX node platforms to support GPU adapters.
Table 1: Intelligently designed AX node portfolio
Note: AX-640, AX-740xd, and AX-6515 platforms do not support GPUs.
The next obvious question is what GPU type and number of adapters are supported by each platform.
We have selected the following two NVIDIA adapters to start with:
The following table details how many GPU adapter cards of each type are allowed in each AX node:
Table 2: AX node support for GPU adapter cards
GPU adapter | AX-750 | AX-650 | AX-7525 |
---|---|---|---|
NVIDIA A2 | Up to 2 | Up to 2 | Up to 3 |
NVIDIA A30 | Up to 2 | -- | Up to 3 |
Maximum GPU number (must be same model) | 2 | 2 | 3 |
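The limits in Table 2 can be expressed as a small lookup, which is handy for sanity-checking a proposed node configuration. The following is a minimal sketch: the `MAX_GPUS` dictionary transcribes the table, while the `validate_gpu_config` function and its messages are hypothetical and not part of any Dell tool.

```python
# Maximum adapters per node, transcribed from Table 2 (0 = not supported).
MAX_GPUS = {
    "AX-750": {"NVIDIA A2": 2, "NVIDIA A30": 2},
    "AX-650": {"NVIDIA A2": 2, "NVIDIA A30": 0},
    "AX-7525": {"NVIDIA A2": 3, "NVIDIA A30": 3},
}


def validate_gpu_config(node, gpus):
    """Return a list of problems with a proposed GPU list for one AX node."""
    if node not in MAX_GPUS:
        return [f"{node} does not support GPUs"]
    problems = []
    models = set(gpus)
    # Table 2 requires all adapters in a node to be the same model.
    if len(models) > 1:
        problems.append("all GPUs in a node must be the same model")
    for model in sorted(models):
        limit = MAX_GPUS[node].get(model, 0)
        count = gpus.count(model)
        if limit == 0:
            problems.append(f"{model} is not supported on {node}")
        elif count > limit:
            problems.append(f"{node} supports up to {limit} x {model}, got {count}")
    return problems


print(validate_gpu_config("AX-7525", ["NVIDIA A30"] * 3))  # no problems expected
print(validate_gpu_config("AX-650", ["NVIDIA A30"]))       # A30 is not listed for AX-650
```

An empty list means the configuration fits within the table; anything else names the rule that was broken.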
The NVIDIA A2 is the entry-level option for any server to get basic AI capabilities. It delivers versatile inferencing acceleration for deep learning, graphics, and video processing in a low-profile, low-consumption PCIe Gen 4 card.
The A2 is the perfect candidate for workloads with light AI demands in the data center. It especially shines in edge environments, thanks to its excellent balance of form factor, performance, and power consumption, which results in lower costs.
The NVIDIA A30 is a more powerful mainstream option for the data center, typically covering scenarios that require more demanding accelerated AI performance and a broad variety of workloads:
There are two GPU virtualization technologies in Azure Stack HCI: Discrete Device Assignment (also known as GPU pass-through) and GPU partitioning.
DDA support for Dell Integrated System for Azure Stack HCI was introduced with Azure Stack HCI OS 21H2. With DDA, each GPU is dedicated to a single VM (no sharing): DDA passes an entire PCIe device into the VM, providing high-performance access to the device while letting the VM use the device’s native drivers. The following figure shows how DDA directly reassigns the whole GPU from the host to the VM:
Figure 1: Discrete Device Assignment in action
To learn more about how to use and configure GPUs with clustered VMs with Azure Stack HCI OS 21H2, you can check Microsoft Learn and the Dell Info Hub.
GPU partitioning allows you to share a physical GPU device among several VMs. By leveraging single root I/O virtualization (SR-IOV), GPU-P provides VMs with a dedicated and isolated fractional part of the physical GPU. The following figure explains this more visually:
Figure 2: GPU partitioning virtualizing 2 physical GPUs into 4 virtual GPUs (vGPUs)
The obvious advantage of GPU-P is that it enables enterprise-wide utilization of highly valuable and limited GPU resources.
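To make the difference in capacity concrete, here is a toy bookkeeping sketch of the scenario in Figure 2. It only models which VM receives which partition; it does not touch SR-IOV, Hyper-V, or any real GPU API. The function names and the two-partitions-per-GPU value are illustrative assumptions.

```python
def make_partitions(physical_gpus, partitions_per_gpu):
    """Name the equal-sized partitions exposed by each physical GPU."""
    return [
        f"{gpu}:partition{index}"
        for gpu in physical_gpus
        for index in range(partitions_per_gpu)
    ]


def assign_partitions(vms, partitions):
    """Give each VM its own partition; no partition is shared between VMs."""
    if len(vms) > len(partitions):
        raise ValueError(
            f"{len(vms)} VMs requested but only {len(partitions)} partitions exist"
        )
    return dict(zip(vms, partitions))


# Figure 2 scenario: 2 physical GPUs split into 4 partitions in total.
partitions = make_partitions(["gpu0", "gpu1"], partitions_per_gpu=2)
placement = assign_partitions(["vm-a", "vm-b", "vm-c", "vm-d"], partitions)
print(placement)
```

Setting `partitions_per_gpu=1` reproduces the DDA situation in this model: each GPU serves exactly one VM, so the same two GPUs could back only two VMs.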
Note these important considerations for using GPU-P:
You’re probably wondering about Azure Virtual Desktop on Azure Stack HCI (still in preview) and GPU-P. We have a Dell Validated Design today and will be refreshing it to include GPU-P during this calendar year.
To learn more about how to use and configure GPU-P with clustered VMs with Azure Stack HCI OS 22H2, you can check Microsoft Learn and the Dell Info Hub (Dell documentation coming soon).
As of today, Dell Integrated System for Microsoft Azure Stack HCI only provides support for Azure Stack HCI OS 21H2 and DDA.
Full support for Azure Stack HCI OS 22H2 and GPU-P is around the corner, by the end of the first quarter, 2023.
The wait is finally over: we can now bring the GPU power required by highly demanding AI/ML workloads to our Azure Stack HCI environments.
Today, DDA provides fully dedicated GPU pass-through utilization, whereas with GPU-P we will very soon have the choice of providing a more granular GPU consumption model.
Thanks for reading, and stay tuned for the ever-expanding list of validated GPUs that will unlock and enhance even more use cases and workloads!
Author: Ignacio Borrero, Senior Principal Engineer, Technical Marketing Dell CI & HCI
@virtualpeli
Mon, 30 May 2022 17:05:47 -0000
Dell Hybrid Management: Azure Policies for HCI Compliance and Remediation
Companies that take an “Azure hybrid first” strategy are making a wise and future-proof decision by consolidating the advantages of both worlds—public and private—into a single entity.
Sounds like the perfect plan, but a key consideration for these environments to work together seamlessly is true hybrid configuration consistency.
A major challenge in the past was having the same level of configuration rules concurrently in Azure and on-premises. This required different tools and a lot of costly manual interventions (subject to human error) that resulted, usually, in potential risks caused by configuration drift.
But those days are over.
We are happy to introduce Dell HCI Configuration Profile (HCP) Policies for Azure, a revolutionary and crucial differentiator for Azure hybrid configuration compliance.
So, what is it? How does it work? What value does it provide?
Dell HCP Policies for Azure is our latest development for Dell OpenManage Integration with Windows Admin Center (OMIMSWAC). With it, we can now integrate Dell HCP policy definitions into Azure Policy. Dell HCP is the specification that captures the best practices and recommended configurations for Azure Stack HCI and Windows-based HCI solutions from Dell to achieve better resiliency and performance with Dell HCI solutions.
The HCP Policies feature functions at the cluster level and is supported for clusters that are running Azure Stack HCI OS (21H2) and pre-enabled for Windows Server 2022 clusters.
IT admins can manage Azure Stack HCI environments through two different approaches:
By using a single Dell HCP policy definition, both options provide a seamless and consistent management experience.
Running Check Compliance automatically compares the recommended rules packaged together in the Dell HCP policy definitions with the settings on the running integrated system. These rules include configurations that address the hardware, cluster symmetry, cluster operations, and security.
Dell HCP Policy Summary provides the compliance status of four policy categories:
To re-align non-compliant policies with the best practices validated by Dell Engineering, our Dell HCP policy remediation integration with WAC (unique at the moment) helps to fix any non-compliant errors. Simply click “Fix Compliance.”
Some fixes may require manual intervention; others can be corrected in a fully automated manner using the Cluster-Aware Updating framework.
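Conceptually, a compliance check is a comparison between a recommended baseline and the settings found on the running system, and remediation closes the gap where it can be automated. The sketch below illustrates that idea only; the setting names, values, and functions are hypothetical and do not represent the actual Dell HCP rules or the OMIMSWAC implementation.

```python
def check_compliance(recommended, actual):
    """Return {setting: (expected, found)} for every setting that drifted."""
    return {
        key: (expected, actual.get(key))
        for key, expected in recommended.items()
        if actual.get(key) != expected
    }


def fix_compliance(recommended, actual, manual_only=()):
    """Apply recommended values in place; return settings left for manual action."""
    remaining = []
    for key, (expected, _found) in check_compliance(recommended, actual).items():
        if key in manual_only:
            remaining.append(key)   # needs an administrator
        else:
            actual[key] = expected  # automated remediation
    return remaining


# Hypothetical baseline and running configuration.
recommended = {"secure_boot": "Enabled", "power_profile": "Performance", "nodes_symmetric": True}
running = {"secure_boot": "Disabled", "power_profile": "Performance", "nodes_symmetric": False}

drift = check_compliance(recommended, running)
left_for_admin = fix_compliance(recommended, running, manual_only={"nodes_symmetric"})
print(drift)
print(left_for_admin)
```

Because both management paths would evaluate the same baseline dictionary in this model, the result is the same wherever the check is run, which is the point of a single policy definition.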
The “Azure hybrid first” strategy is real today. You can use Dell HCP Policies for Azure, which provides a single-policy definition with Dell HCI Configuration Profile and a consistent hybrid management experience, whether you use Dell OMIMSWAC for local management or Azure Portal for management at-scale.
With Dell HCP Policies for Azure, policy compliance and remediation are fully covered for Azure and Azure Stack HCI hybrid environments.
You can see Dell HCP Policies for Azure in action at the interactive Dell Demo Center.
Thanks for reading!
Author: Ignacio Borrero, Dell Senior Principal Engineer CI & HCI, Technical Marketing
Twitter: @virtualpeli
Mon, 21 Feb 2022 17:45:58 -0000
Global damages related to cybercrime were predicted to reach USD 6 trillion in 2021! This staggering number highlights the very real security threat faced not only by big companies, but also by small and medium businesses across all industries.
Cyber attacks are becoming more sophisticated every day and the attack surface is constantly increasing, now even including the firmware and BIOS on servers.
Figure 1: Cybercrime figures for 2021
However, this isn’t all bad news, as there are now two new technologies (and some secret sauce) that we can leverage to proactively defend against unauthorized access and attacks to our Azure Stack HCI environments, namely:
Let’s briefly discuss each of them.
Secured-core is a set of Microsoft security features that leverage the latest security advances in Intel and AMD hardware. It is based on the following three pillars:
Infrastructure lock provides robust protection against unauthorized access to resources and data by preventing unintended changes to both hardware configuration and firmware updates.
When the infrastructure is locked, any attempt to change the system configuration is blocked and an error message is displayed.
Now that we understand what these technologies provide, one might have a few more questions, such as:
In short, deploying these technologies is not an easy task unless you have the right set of tools in place.
This is where you need the “secret sauce”: the Dell OpenManage Integration with Microsoft Windows Admin Center (OMIMSWAC) on top of our certified Dell Cyber-resilient Architecture, as illustrated in the following figure:
Figure 2: OMIMSWAC and Dell Cyber-resilient Architecture with AX Nodes
As a quick reminder, Windows Admin Center (WAC) is Microsoft’s single pane of glass for all Windows management related tasks.
Dell OMIMSWAC extensions make WAC even better by providing additional controls and management possibilities for certain features, such as Secured-core and Infrastructure lock.
Dell Cyber Resilient Architecture 2.0 safeguards customers’ data and intellectual property with a robust, layered approach.
Since a picture is worth a thousand words, the next section will show you what WAC extensions look like and how easy and intuitive they are to play with.
The following figure shows our Secured-core snap-in integration inside the WAC security blade and workflow.
Figure 3: OMIMSWAC Secured-core view
The OS Security Configuration Status and the BIOS Security Configuration Status are displayed. The BIOS Security Configuration Status is where we can set the Secured-core required BIOS settings for the entire cluster.
OS Secured-core settings are visible but cannot be altered using OMIMSWAC (you would directly use WAC for it). You can also view and manage BIOS settings for each node individually.
Figure 4: OMIMSWAC Secured-core, node view
Prior to enabling Secured-core, the cluster nodes must be updated to Azure Stack HCI, version 21H2 (or newer). For AMD Servers, the DRTM boot driver (part of the AMD Chipset driver package) must be installed.
The following figure illustrates the Infrastructure lock snap-in integration inside the WAC security blade and workflow. Here we can enable or disable Infrastructure lock to prevent unintended changes to both hardware configuration and firmware updates.
Figure 5: OMIMSWAC Infrastructure lock
Enabling Infrastructure lock also blocks the server or cluster firmware update process through the OpenManage Integration extension. If you run a Cluster-Aware Updating (CAU) operation with Infrastructure lock enabled, a compliance report is generated and the cluster update is blocked. If this occurs, you have the option to temporarily disable Infrastructure lock and have it automatically re-enabled when the CAU operation is complete.
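The "disable temporarily, then re-enable automatically" behavior maps naturally onto a scoped guard. The following is a minimal sketch of that pattern, assuming a hypothetical in-memory `Cluster` class; it is not how OMIMSWAC or the server firmware implements the lock.

```python
from contextlib import contextmanager


class Cluster:
    """Hypothetical stand-in for a cluster protected by Infrastructure lock."""

    def __init__(self):
        self.infrastructure_lock = True
        self.firmware = "1.0"

    def update_firmware(self, version):
        # While locked, configuration and firmware changes are rejected.
        if self.infrastructure_lock:
            raise PermissionError("Infrastructure lock is enabled; update blocked")
        self.firmware = version


@contextmanager
def lock_temporarily_disabled(cluster):
    """Lift the lock for the duration of the block, then restore its prior state."""
    was_locked = cluster.infrastructure_lock
    cluster.infrastructure_lock = False
    try:
        yield cluster
    finally:
        # Restored even if the update fails partway through.
        cluster.infrastructure_lock = was_locked


cluster = Cluster()
with lock_temporarily_disabled(cluster):
    cluster.update_firmware("2.0")
print(cluster.firmware, cluster.infrastructure_lock)
```

The `finally` clause is the important design choice: the lock returns to its previous state whether the update succeeds or raises, so the system is never left unlocked by accident.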
Dell understands the importance of the new security features introduced by Microsoft and has developed a programmatic approach, through OMIMSWAC and Dell’s Cyber-resilient Architecture, to consistently deliver and control these new features in each node and cluster. These features allow customers to always be secure and compliant on Azure Stack HCI environments.
Stay tuned for more updates (soon) on the compliance front, and thank you for reading this far!
Ignacio Borrero, Senior Principal Engineer, Technical Marketing
Twitter: @virtualpeli
2020 Verizon Data Breach Investigations Report
2019 Accenture Cost of Cybercrime Study
Global Ransomware Damage Costs Predicted To Reach $20 Billion (USD) By 2021
Cybercrime To Cost The World $10.5 Trillion Annually By 2025
The global cost of cybercrime per minute to reach $11.4 million by 2021
Wed, 22 Sep 2021 18:17:41 -0000
If history has taught us anything, it’s that disasters are always around the corner and tend to appear in any shape or form when they’re least expected.
To overcome these circumstances, we need the appropriate tools and technologies to guarantee that operations return to normal in a secure, automatic, and timely manner.
Traditional disaster recovery (DR) processes are often complex and require a significant infrastructure investment. They are also labor intensive and prone to human error.
Since December 2020, the situation has changed. Thanks to the new release of Microsoft Azure Stack HCI, version 20H2, we can leverage the new Azure Stack HCI stretched cluster feature on Dell EMC Integrated System for Microsoft Azure Stack HCI (Azure Stack HCI).
The integrated system is based on our flexible AX nodes family as the foundation, and combines Dell Technologies full stack life cycle management with the Microsoft Azure Stack HCI operating system.
It is important to note that this technology is only available for the integrated system offering under the certified Azure Stack HCI catalog.
Azure Stack HCI stretch clustering provides an easy and automatic solution (no human interaction if desired) that assures transparent failovers of disaster-impacted production workloads to a safe secondary site.
It can also be leveraged to perform planned operations (such as entire site migration, or disaster avoidance) that, until now, required labor intensive and error prone human effort for execution.
Stretch clustering is one type of Storage Replica configuration. It allows customers to split a single cluster between two locations—rooms, buildings, cities, or regions. It provides synchronous or asynchronous replication of Storage Spaces Direct volumes to provide automatic VM failover if a site disaster occurs.
There are two different topologies:
Azure Stack HCI stretch clustering topologies: Active-Passive and Active-Active
To be truly cost-effective, the best data protection strategies incorporate a combination of different technologies (deduplicated backup, archive, data replication, business continuity, and workload mobility) to deliver the right level of data protection for each business application.
The following diagram highlights the fact that just a reduced data set holds the most valuable information. This is the sweet spot for stretch clustering.
For a real-life experience, our Dell Technologies experts put Azure Stack HCI stretched clustering to the test in the following lab setup:
Test lab cluster network topology
Note these key considerations regarding the lab network architecture:
For all the details, see this white paper: Adding Flexibility to DR Plans with Stretch Clustering for Azure Stack HCI.
In this blog though, I only want to focus on summarizing the results we obtained in our labs for the following four scenarios:
Scenario | Event | Simulated failure or maintenance event | Stretched Cluster expected response | Stretched Cluster actual response |
---|---|---|---|---|
1 | Unplanned node failure | Node 1 in Site 1 power-down | Impacted VMs should failover to another local node | In around 5 minutes, all 10 VMs in Node 1 Site 1 fully restarted in Node 2 Site 1. This is expected behavior since Site 1 has been configured as preferred site; otherwise, the active volume could have been moved to Site 2, and the VMs would have been restarted on a cluster node in Site 2. |
2 | Outage in Site 1 | Simultaneous power-down of Nodes 1 and 2 in site 1 | Impacted VMs should failover to nodes on the secondary site | In 25 minutes, all VMs were restarted, and the included web application was fully responsive. The volumes owned by the nodes in Site 2 remained online throughout this failure scenario. The replica volumes remained offline until Site 1 was restored to full health. Once Site 1 was back online, synchronous replication began again from the source volumes in Site 2 to their destination replica partners in Site 1. |
3 | Planned failover | Switch Direction operation on a volume from Windows Admin Center | Selected VMs and workloads should transparently move to secondary site | Within 0 to 3 mins, the application hosted by the affected VMs was reachable without service interruption (time depends on whether IP reassignment is required). First, the owner node for the volumes changed to Node 2 in Site 2, and owner node for the replica volumes changed to Node 2 in Site 1. No service interruption. At this time, the test VM was running in Site 1, but its virtual disk that resided on the volume was running in Site 2. Performance problems can result because I/O is traversing the replication links across sites. After approximately 10 minutes, a Live Migration of the test VM would occur automatically (if not manually initiated earlier) so that the VM would be on the same node as its virtual disk. |
4 | Lifecycle management | Update all nodes in the cluster by using Single-click Full Stack Cluster Aware Updating (CAU) in Windows Admin Center | Stretched cluster and CAU should work seamlessly together to provide full stack cluster update without service interruption and local only workload mobility for the Live Migrated VMs | The total process of applying the operating system and firmware updates to the stretched cluster took approximately 3 hours, and the process had no application impact. Each node was drained, and its VMs were live migrated to the other node in the same site. The intersite links between Site 1 and Site 2 were never used during update operations. In addition, the process required only a single reboot per node. This behavior was consistent throughout the update of all the nodes in the stretched cluster. |
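The placement behavior seen in scenarios 1 and 2 (restart on a surviving node in the same site if one exists, otherwise move to the other site) can be sketched in a few lines of Python. This is a deliberately simplified model under our own assumptions: the node names, the `SITES` mapping, and `pick_restart_node` are hypothetical, and a real stretched cluster also takes preferred sites, volume ownership, and Storage Replica state into account.

```python
from typing import Dict, List, Optional


def pick_restart_node(
    failed_nodes: List[str],
    vm_node: str,
    site_of: Dict[str, str],
) -> Optional[str]:
    """Choose where a VM restarts: same site first, then the other site."""
    if vm_node not in failed_nodes:
        return vm_node  # the VM's host is healthy, nothing to do
    survivors = [n for n in site_of if n not in failed_nodes]
    same_site = [n for n in survivors if site_of[n] == site_of[vm_node]]
    if same_site:
        return same_site[0]
    # No local survivor: fall back to any node in the remaining site.
    return survivors[0] if survivors else None


# Hypothetical four-node, two-site layout similar to the lab setup.
SITES = {
    "site1-node1": "Site 1",
    "site1-node2": "Site 1",
    "site2-node1": "Site 2",
    "site2-node2": "Site 2",
}

# Scenario 1: a single node in Site 1 fails.
scenario_1 = pick_restart_node(["site1-node1"], "site1-node1", SITES)

# Scenario 2: both nodes in Site 1 fail.
scenario_2 = pick_restart_node(
    ["site1-node1", "site1-node2"], "site1-node1", SITES
)
```

In this model, `scenario_1` resolves to the other Site 1 node and `scenario_2` to a Site 2 node, matching the direction of the failovers reported in the table.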
To sum up, Azure Stack HCI Stretch Clustering has been shown to work as expected under difficult circumstances. It can easily be leveraged to cover a wide range of data protection scenarios, such as:
This technology may make the difference for businesses that need to get back on their feet automatically after disaster strikes: a total game changer in the automatic disaster recovery landscape.
Thank you for your time reading this blog and don’t forget to check out the full white paper!!!
Wed, 22 Sep 2021 18:15:33 -0000
We are happy to announce the latest members of the family for our Microsoft HCI Solutions from Dell Technologies: the new AX-650 and AX-750 nodes.
If you are already familiar with our existing integrated system offering, you can directly jump to the next section. For those new to the party, keep on reading!
Figure 1: Dell EMC Integrated System for Microsoft Azure Stack HCI portfolio: New AX-650 and AX-750 nodes
As with all other nodes supported by Dell EMC Integrated System for Microsoft Azure Stack HCI, the AX-650 and AX-750 nodes have been intelligently and deliberately configured with a wide range of component options to meet the requirements of nearly any use case – from the smallest remote or branch office to the most demanding database workloads.
The chassis, drive, processor, DIMM module, network adapter, and their associated BIOS, firmware, and driver versions have been carefully selected and tested by the Dell Technologies engineering team to optimize the performance and resiliency of Azure Stack HCI. Our engineering has also validated networking topologies using PowerSwitch network switches.
Arguably the most compelling aspect of our integrated system is our life cycle management capability. The Integrated Deploy and Update snap-in works with the Microsoft cluster creation extension to deliver Dell EMC HCI Configuration Profile. This Configuration Profile ensures a consistent, automated initial cluster creation experience on Day 1. The one-click full stack life cycle management snap-in for the Microsoft Cluster-Aware Updating extension allows administrators to apply updates. This seamlessly orchestrates OS, BIOS, firmware, and driver updates through a common Windows Admin Center workflow.
On top of it, Dell Technologies makes support services simple, flexible, and worry free – from installation and configuration to comprehensive, single source support. Certified deployment engineers ensure accuracy and speed, reduce risk and downtime, and free IT staff to work on those higher value priorities. Our one-stop cluster level support covers the hardware, operating system, hypervisor, and Storage Spaces Direct software, whether you purchased your license from Dell EMC or from Microsoft.
Now that we are on the same page with our integrated system…
AX-650 and AX-750 are based on Intel Xeon Scalable 3rd generation Ice Lake processors that introduce big benefits in three main areas:
Customers always demand the highest levels of performance available, and our new 15G platforms, through Intel Ice Lake and its latest 10nm technology, deliver huge performance gains (compared to the previous generation) for:
These impressive figures are a big step forward from a hardware boost perspective, but there are even more important things going on than just brute power and performance.
Our new 15G platforms lay the technology foundation for the latest features that are coming (really) soon with the new version of Microsoft Azure Stack HCI.
Windows Server 2022 and Azure Stack HCI, version 21H2 will bring in (when they are made available) the following two key features:
The fundamental idea of Secured-core Server is to stay ahead of attackers and protect our customers’ infrastructure and data all through hardware, BIOS, firmware, boot, drivers, and the operating system. This idea is based on three pillars:
For more details about Secured-core Server, click here.
Figure 2: Secured-core Server with Windows Admin Center integration
AX-650, AX-750, and AX-7525 are the first AX nodes to introduce GPU readiness for single-width and double-width GPUs.
With the September 21, 2021 launch, all configurations planned to support GPUs are already enabled in anticipation of the appropriate selection of components (such as GPU risers, power supplies, fans, and heatsinks).
This permits the GPU(s) to be added later (once properly validated and certified) as an After Point of Sale (APOS) option.
The first GPU that will be made available with AX nodes (AX-650, AX-750, and AX-7525) is the NVIDIA T4 card.
To prepare for this GPU, customers should opt for the single-width capable PCI riser.
The following table shows the maximum number of adapters per platform taking into account the GPU form factor:
Configuration | AX-750 single width | AX-750 dual width | AX-650 single width | AX-650 dual width | AX-7525 single width | AX-7525 dual width |
All SSD | Up to 3 ¹ | Up to 2 | Up to 2 ² | N/A | | |
All NVMe | Up to 3 ¹ | Up to 2 | Up to 2 ² | N/A | Up to 3 ³ | Up to 3 ³ |
NVMe+SSD |
| Up to 4 | Up to 3 |
¹ Max of 3 factory installed with Mellanox NIC adapters. Exploring options for up to 4 SW GPUs
² Depending on the number of RDMA NICs
³ Only with the x16 NVMe chassis. x24 NVMe chassis does not support any GPUs
Note that no GPUs are available at the September 21, 2021 launch. GPUs will not be validated and factory installable until early 2022.
Dell EMC OpenManage Integration with Microsoft Windows Admin Center (OMIMSWAC) extension was launched in 2019.
It has included hardware and firmware inventory, real time health monitoring, iDRAC integrated management, troubleshooting tools, and seamless updates of BIOS, firmware, and drivers.
In the 2.0 release in February 2020, we also added single-click full stack life cycle management with Cluster-Aware Updating for the Intel-based Azure Stack HCI platforms. This allowed us to orchestrate OS, BIOS, firmware, and driver updates through a single Admin Center workflow, requiring only a single reboot per node in the cluster and resulting in no interruption to the services running in the VMs.
With the Azure Stack HCI June 2021 release, the OpenManage Integration extension added support for the AX-7525 and AX-6515 AMD based platforms.
Now, with the September 21, 2021 launch, OMIMSWAC 2.1 features a great update for AX nodes, including these important extensions:
Integrated Deploy & Update deploys Azure Stack HCI with Dell EMC HCI Configuration Profile for optimal cluster performance. Our integration also adds the ability to apply hardware solution updates like BIOS, firmware, and drivers at the same time as operating system updates as part of cluster creation with a single reboot.
With CPU Core Management, customers can dynamically adjust the CPU core count BIOS settings without leaving the OpenManage Integration extension in Windows Admin Center, helping to maintain the right balance between cost and performance.
Cluster Expansion helps to prepare new cluster nodes before adding them to the cluster, to significantly simplify the cluster expansion process, reduce human error, and save time.
Figure 3: CPU Core Management and Cluster Expansion samples
In conclusion, the AX-650 and AX-750 nodes establish the most performant and easy to operate foundation for Azure Stack HCI today, along with all the new features and goodness that Microsoft is preparing. Stay tuned for more news and updates on this front!
Ignacio Borrero, @virtualpeli
Mon, 26 Jul 2021 12:46:04 -0000
On August 31, 2017, Microsoft launched Azure Stack Hub and enabled a true hybrid cloud operating model to extend Azure services on-premises. An awesome and long-awaited milestone at that time!
Implementing Azure Stack Hub in our customers’ datacenters under normal circumstances is a pretty straightforward process today if you choose our Dell EMC Integrated System for Microsoft Azure Stack Hub.
But there are certain cases where delivering Azure Stack Hub may be complex (or even impossible), especially in scenarios such as:
The final outcome in these environments remains the same: provide always-on cloud services everywhere from a minimal set of local resources.
The question is… how do we make this possible?
Dell Technologies, in partnership with Microsoft and Tracewell Systems, has developed Dell EMC Integrated System for Microsoft Azure Stack Hub – Tactical (aka Azure Stack Hub – Tactical): a unique ruggedized and field-deployable solution for Azure Stack tactical edge environments.
Azure Stack Hub – Tactical extends Azure-based solutions beyond the traditional data center to a wide variety of non-standard environments, providing a local Azure consistent cloud with:
Azure Stack Hub – Tactical is functionally and electrically identical to Azure Stack Hub All-Flash to ensure interoperability. It includes custom engineered modifications to make the whole solution fit into just three ruggedized cases that are only 23.80 inches wide, 41.54 inches high, and 25.63 inches deep.
The smallest Azure Stack Hub – Tactical configuration comprises one management case plus two compute cases, each of them containing:
2 x T-R640 servers, based on Dell EMC PowerEdge R640 All-Flash server adapted for tactical use (2U each)
Two configuration options for compute servers:
Compute cases can grow up to 8, for a total of 16 servers (in 4-node increments) -- the scale unit maximum mandated by Microsoft.
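As a quick sanity check on the sizing rules above, the short Python sketch below enumerates the compute configurations they allow. It assumes (our reading of the text, not an official sizing tool) that the two servers listed belong to each compute case and that growth starts from the smallest configuration; all names are made up for illustration.

```python
SERVERS_PER_COMPUTE_CASE = 2   # "2 x T-R640 servers" per compute case
MIN_COMPUTE_CASES = 2          # smallest configuration
MAX_COMPUTE_CASES = 8          # 16 servers, the scale-unit maximum
NODE_INCREMENT = 4             # growth happens in 4-node steps


def valid_configurations():
    """Return (compute_cases, servers) pairs allowed by the rules above."""
    configs = []
    servers = MIN_COMPUTE_CASES * SERVERS_PER_COMPUTE_CASE
    while servers <= MAX_COMPUTE_CASES * SERVERS_PER_COMPUTE_CASE:
        configs.append((servers // SERVERS_PER_COMPUTE_CASE, servers))
        servers += NODE_INCREMENT
    return configs


print(valid_configurations())
# → [(2, 4), (4, 8), (6, 12), (8, 16)]
```

Under these assumptions there are four possible sizes, from 2 compute cases (4 servers) up to 8 compute cases (16 servers), with the management case always added on top.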
You can read the full specifications here.
Azure Stack Hub – Tactical is a turnkey end to end engineered solution designed, tested, and sustained through the entire lifespan of all of its hardware and software components.
It includes non-disruptive operations and automated full stack life cycle management for on-going component maintenance, fully coordinated with Microsoft’s Update process.
Customers also benefit from a simplified one call support model across all solution components.
Desperate “edge-cuts” must have desperate “tactical-cures”, and that is exactly what Dell EMC Integrated System for Microsoft Azure Stack Hub – Tactical delivers to our customers for edge environments and extreme conditions.
Azure Stack Hub – Tactical resolves the challenges of providing Azure cloud services everywhere by allowing our customers to add/remove deployments with relative ease through an automated, repeatable, and predictable process requiring minimal local IT resources.
Thanks for reading and stay tuned for more blog updates in this space by visiting Info Hub!
Wed, 16 Jun 2021 13:35:49 -0000
Dell EMC Integrated System for Microsoft Azure Stack HCI is a fully integrated HCI system for hybrid cloud environments that delivers a modern, cloud-like operational experience on-premises from a mature market leader.
The integrated system is based on our flexible AX nodes family as the foundation, and combines Dell Technologies full stack life cycle management with the Microsoft Azure Stack HCI operating system.
This blog focuses on one of the most important and critical parts of Azure Stack HCI: the management layer. Check this blog for additional background.
We will show how at Dell Technologies we make the good - Microsoft Windows Admin Center (WAC) - even better, through our OpenManage Integration with Microsoft Windows Admin Center v2.0 (OMIMSWAC).
The following diagram illustrates a typical Dell Technologies Azure Stack HCI setup:
To learn more about Microsoft HCI Solutions from Dell Technologies and get details on each of the different components, check out this video where our Dell Technologies experts examine the solution thoroughly from the bottom up.
WAC provides the option to leverage easy-to-use workflows to perform many tasks, including automatic deployments (coming soon) and updates.
Dell Technologies has developed specialized snap-ins that integrate OpenManage with WAC to further extend the capabilities of Microsoft’s WAC extensions.
The following table describes the three key elements highlighted in the previous diagram as (1), (2), and (3). We examine each in detail in the next three sections.
Item | Type | Integrates with | Developed by | Description |
---|---|---|---|---|
Microsoft Cluster Aware Updating extension (Microsoft Failover Cluster Tool Extension 1.250.0.nupkg release, minimum version validated) | Extension | WAC | Microsoft | WAC workflow to apply cluster aware OS updates |
Dell EMC Integrated Full Stack Cluster Aware Updating | Integration | Microsoft CAU extension | Dell Technologies | Integration snap-in to main CAU workflow to provide BIOS, firmware and driver updates while performing OS updates |
OMIMSWAC v2.0 Standalone extension | Extension | WAC | Dell Technologies | OpenManage WAC extension for Infrastructure Life cycle management, plus cluster monitoring, inventory and troubleshooting |
Cluster Creation extension (Microsoft Cluster Creation Extension 1.529.0.nupkg release, minimum version validated) | Extension | WAC | Microsoft | WAC workflow to create Azure Stack HCI Clusters |
Integrated Deployment and Update (coming soon) | Integration | Microsoft IDU extension | Dell Technologies | Integration snap-in to main Cluster Creation workflow to provide BIOS, firmware and driver updates during the cluster creation process |
Windows Admin Center extensions and integrations
You can install Microsoft Cluster Aware Updating extension within WAC by selecting the “Gear” icon on the top right corner, then under “Gateway”, navigate to “Extensions”. Under “Available extensions”, find the desired extension and select “Install”. For details, see the install guide. Please refer to the extensions product documentation for the latest updates.
To get to the Microsoft WAC Azure Stack HCI Cluster Aware Updating extension, log in to WAC and follow these steps:
It is important to note that you can either run only one operation (skipping the other) or run both in a single process with a single reboot.
You may select, if available, any Operating system update and click “Next: Hardware updates”.
This takes us to the second step of the sequence - Hardware updates - a key phase for the automated end-to-end cluster aware update process.
This is where the Dell Technologies snap-in integrates with Microsoft’s original workflow, allowing us to seamlessly provide automated BIOS, firmware, and driver updates (and OS updates if also selected) to all the nodes in the cluster with a single reboot. Let’s look at this process in detail in the next section.
Once you click “Next: Hardware updates” in Microsoft’s original Azure Stack HCI Cluster Aware Updating workflow, you are taken to the Dell EMC Cluster Aware Updating integration.
If the integration is not installed, there is an option to install it from inside the workflow.
Click “Get updates”.
Our snap-in for Cluster Aware Updating (CAU) takes us through the following sequence of five steps.
1. Prerequisites (screenshot above)
A validation process occurs, checking that all AX nodes are:
Click “Next: Update source”.
2. Update source
Here we can select the source for our BIOS, firmware, and driver repository, whether online [Update Catalog for Microsoft HCI Solutions] or offline (edge or disconnected) [Dell EMC Repository Manager Catalog]. Dell Technologies creates these solution catalogs and keeps them updated.
Click “Next: Compliance report”.
3. Compliance report
Now we can check how compliant our nodes are and select BIOS, firmware, and/or driver components for remediation. All the recommended components are selected by default.
The compliance operation runs in parallel for all nodes, and the report is shown consolidated across nodes.
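For readers who like to see the idea in code, here is a rough Python sketch of a compliance check that runs per node in parallel and then merges everything into one report. It is not how OMIMSWAC is implemented: the catalog, the inventory data, the version numbers, and the function names are all invented for illustration, and the comparison is a plain "does it match the catalog" test rather than real version ordering.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical catalog of target versions (made-up component names and numbers).
CATALOG = {"BIOS": "2.2.5", "NIC firmware": "22.0.9", "Chipset driver": "10.1.18"}

# Hypothetical installed versions per node.
INVENTORY = {
    "node1": {"BIOS": "2.2.5", "NIC firmware": "21.8.0", "Chipset driver": "10.1.18"},
    "node2": {"BIOS": "2.1.3", "NIC firmware": "22.0.9", "Chipset driver": "10.1.18"},
    "node3": {"BIOS": "2.2.5", "NIC firmware": "22.0.9", "Chipset driver": "10.1.18"},
}


def check_node(node):
    """Return (node, component, installed, wanted) rows that need remediation."""
    installed = INVENTORY[node]
    return [
        (node, component, installed.get(component), wanted)
        for component, wanted in CATALOG.items()
        if installed.get(component) != wanted
    ]


def compliance_report(nodes):
    """Check every node in parallel and return one consolidated list of rows."""
    with ThreadPoolExecutor() as pool:
        per_node = list(pool.map(check_node, nodes))
    return [row for rows in per_node for row in rows]


report = compliance_report(list(INVENTORY))
```

With the sample data above, `report` contains two rows: the out-of-date NIC firmware on `node1` and the out-of-date BIOS on `node2`, while `node3` is fully compliant and contributes nothing.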
Click “Next: Summary”.
4. Summary
All selections from all nodes are shown in Summary for review before we click “Next: Download updates”.
5. Download updates
This window provides the statistics regarding the download process (start time, download status).
When all downloads are completed, we can click “Next: Install”, which takes us back again to Step 3 of the main workflow (“Install”), to begin the installation process of OS and hardware updates (if both were selected) on the target nodes.
If any of the updates requires a restart, servers will be rebooted one at a time, moving cluster roles such as VMs between servers to prevent downtime and guarantee business continuity.
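The one-node-at-a-time pattern can be modeled with a small Python sketch. This is a toy simulation under our own assumptions (it simply moves each VM to the least-loaded remaining node before "rebooting" the drained one); `rolling_update` and the sample placement are hypothetical and do not represent the actual Cluster-Aware Updating logic.

```python
def rolling_update(placement):
    """Update nodes one at a time, draining their VMs to other nodes first.

    placement maps node name -> list of VM names and is modified in place.
    Returns the ordered list of actions that were taken.
    """
    actions = []
    for node in list(placement):
        others = [n for n in placement if n != node]
        if not others:
            raise RuntimeError("need at least two nodes to avoid downtime")
        for vm in placement[node]:
            # Send each VM to whichever other node currently hosts the fewest.
            target = min(others, key=lambda n: len(placement[n]))
            placement[target].append(vm)
        placement[node] = []
        actions.append(f"drain {node}")
        actions.append(f"update and reboot {node}")
    return actions


# Hypothetical three-node cluster with a handful of VMs.
cluster = {
    "node1": ["vm-a", "vm-b"],
    "node2": ["vm-c"],
    "node3": ["vm-d", "vm-e"],
}
steps = rolling_update(cluster)
```

Each node is drained before it is rebooted, every node is rebooted exactly once, and no VM is ever left on a node that is being updated, which is the essence of the behavior described above.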
Once the process is finished for all the nodes, we can go back to “Updates” to check for the latest update status and/or Update history for previous updates.
It is important to note that the Cluster Aware Updating extension is supported only for Dell EMC Integrated System for Microsoft Azure Stack HCI.
The standalone extension applies to Windows Server HCI and Azure Stack HCI, and continues to provide monitoring, inventory, troubleshooting, and hardware updates with CAU.
New to OMIMSWAC 2.0 is the option to schedule updates during a programmed maintenance window for greater flexibility and control during the update process.
It is important to note that OMIMSWAC Standalone version provides the Cluster Aware Updating feature for the hardware (BIOS, firmware, drivers) in a single reboot, although this process is not integrated with operating system updates. It provides full lifecycle management just for the hardware, not the OS layer.
Another key takeaway is that OMIMSWAC Standalone version fully supports Dell EMC HCI Solutions for Microsoft Windows Server and even certain qualified previous solutions (Dell EMC Storage Spaces Direct Ready Nodes).
Dell Technologies has developed OMIMSWAC to make integrated systems’ lifecycle management a seamless and easy process. It can fully guarantee controlled end-to-end cluster hardware and software update processes during the lifespan of the service.
The Dell EMC OMIMSWAC automated and programmatic approach provides obvious benefits, like mitigating risk caused by human intervention, significantly fewer steps to update clusters, and significantly less focused attention time for IT administrators. In small 4-node cluster deployments, this can mean up to 80% fewer steps and up to 90% less focused attention from an IT operator.
Full details on the benefits of performing these operations automatically through OMIMSWAC versus doing it manually are explained in this white paper.
Thank you for reading this far and stay tuned for more blog updates in this space!
Wed, 16 Jun 2021 13:35:49 -0000
Dell EMC Integrated System for Microsoft Azure Stack HCI (Azure Stack HCI) is a fully productized HCI solution based on our flexible AX node family as the foundation.
Before I get into some exciting performance test results, let me set the stage. Azure Stack HCI combines the software-defined compute, storage, and networking features of Microsoft Azure Stack HCI OS, with AX nodes from Dell Technologies to deliver the perfect balance for performant, resilient, and cost-effective software-defined infrastructure.
Figure 1 illustrates our broad portfolio of AX node configurations with a wide range of component options to meet the requirements of nearly any use case – from the smallest remote or branch office to the most demanding database workloads.
Figure 1: current platforms supporting our Microsoft HCI Solutions from Dell Technologies
Each chassis, drive, processor, DIMM module, network adapter and their associated BIOS, firmware, and driver versions have been carefully selected and tested by the Dell Technologies Engineering team to optimize the performance and resiliency of Microsoft HCI Solutions from Dell Technologies. Our Integrated Systems are designed for 99.9999% hardware availability*.
* Based on Bellcore component reliability modeling for AX-740xd nodes and S5248S-ON switches a) in 2- to 4-node clusters configured with N + 1 redundancy, and b) in 4- to 16-node clusters configured with N + 2 redundancy, March 2021.
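To put "six nines" in perspective, the following small Python calculation converts an availability percentage into the downtime it allows per year. It is plain arithmetic on the stated 99.9999% design figure (assuming a 365-day year), not a measurement of any particular system.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # non-leap year


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Seconds of downtime per year implied by an availability percentage."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


six_nines = downtime_seconds_per_year(99.9999)
three_nines = downtime_seconds_per_year(99.9)
```

For 99.9999%, this works out to roughly 31.5 seconds per year, compared with nearly nine hours per year at 99.9%.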
Comprehensive management with Dell EMC OpenManage Integration with Windows Admin Center, rapid time to value with Dell EMC ProDeploy options, and solution-level Dell EMC ProSupport complete this modern portfolio.
You'll notice in that table that we have a new addition -- the AX-7525: a dual-socket, AMD-based platform designed for extreme performance and high scalability.
The AX-7525 features direct-attach NVMe drives with no PCIe switch, which provides full Gen4 PCIe potential to each storage device, resulting in massive IOPS and throughput at minimal latency.
To get an idea of how performant and resilient this platform is, our Dell Technologies experts put a 4-node AX-7525 cluster to the test. Each node had the following configuration:
The easy headline would be that this setup consistently delivered nearly 6M IOPS at sub-1 ms latency. One could think that we doctored these performance tests to achieve these impressive figures with just a 4-node cluster!
The reality is that we sought to establish the ‘hero numbers’ as a baseline – ensuring that our cluster was configured optimally. However, we didn’t stop there. We wanted to find out how this configuration would perform with real-world IO patterns. This blog won’t get into the fine-grained details of the white paper, but we’ll review the test methodology for those different scenarios and explain the performance results.
Figure 2 shows the 4-node cluster and fully converged network topology that we built for the lab:
Figure 2: Lab setup
We performed two differentiated sets of tests in this environment:
To generate real-life workloads, we used VMFleet, which leverages PowerShell scripts to create Hyper-V virtual machines executing DISKSPD to produce the desired IO profiles.
We chose the three-way mirror resiliency type for the volumes we created with VMFleet because of its superior performance versus erasure coding options in Storage Spaces Direct.
Now that we have a clearer idea of the lab setup and the testing methodology, let’s move on to the results for the four tests.
Here are the details of the workload profile and the performance we obtained:
IO profile | Block size | Thread count | Outstanding IO | Write % | IO pattern | Total IOs | Latency |
---|---|---|---|---|---|---|---|
B4-T2-O32-W0-PR | 4k | 2 | 32 | 0% | 100% random read | 5,727,985 | 1.3 ms (read) |
B4-T2-O16-W100-PR | 4k | 2 | 16 | 100% | 100% random write | 700,256 | 9 ms* (write) |

IO profile | Block size | Thread count | Outstanding IO | Write % | IO pattern | Throughput |
---|---|---|---|---|---|---|
B512-T1-O8-W0-PSI | 512k | 1 | 8 | 0% | 100% sequential read | 105 GB/s |
B512-T1-O1-W100-PSI | 512k | 1 | 1 | 100% | 100% sequential write | 8 GB/s |
* This slightly higher latency occurs because we are pushing too many Outstanding IOs after performance has already plateaued. We noticed that even with 32 VMs, we hit the same IOs, because all we are doing from that point on is adding more load that a) isn’t driving any additional IOs and b) just adds to the latency.
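The IO profile labels in the table pack the test parameters into a short string. The Python sketch below decodes them, with the field meanings inferred purely from the table columns (B = block size, T = threads, O = outstanding IO, W = write percentage, PR = random, PSI = sequential); it is our own helper, not part of VMFleet or DISKSPD. It also converts an IO rate into bandwidth, assuming "4k" means 4 KiB.

```python
import re
from dataclasses import dataclass


@dataclass
class IoProfile:
    block_size_kib: int
    threads: int
    outstanding_io: int
    write_percent: int
    pattern: str  # "random" or "sequential"


# Label layout inferred from the table, e.g. "B4-T2-O32-W0-PR".
_LABEL = re.compile(r"^B(\d+)-T(\d+)-O(\d+)-W(\d+)-P(R|SI)$")


def parse_profile(label: str) -> IoProfile:
    """Decode a profile label into its individual test parameters."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized profile label: {label}")
    block, threads, outstanding, write, pattern = match.groups()
    return IoProfile(
        int(block), int(threads), int(outstanding), int(write),
        "random" if pattern == "R" else "sequential",
    )


def throughput_gb_per_s(iops: float, block_size_kib: int) -> float:
    """Bandwidth in GB/s (decimal) implied by an IO rate and block size."""
    return iops * block_size_kib * 1024 / 1e9


read_profile = parse_profile("B4-T2-O32-W0-PR")
read_bandwidth = throughput_gb_per_s(5_727_985, read_profile.block_size_kib)
```

As a side note, the 5,727,985 small-block random reads correspond to roughly 23.5 GB/s of bandwidth under this assumption, which helps explain why the large-block sequential profiles are the ones used to measure peak throughput.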
This test sets the bar for the limits and maximum performance we can obtain from this 4-node cluster: almost 6 million read IOs, 700k write IOs, and a bandwidth of 105 GB/s for reads, and 8 GB/s for writes.
The IO profiles for this test encompass a broad range of real-life scenarios:
The following figure shows the details and results we obtained for all the different tested IO patterns:
Figure 3: Test 2 results
These are super impressive results. It is important to notice (on the left) the 1.6 million IOPS at 1.2 millisecond average latency for the typical OLTP IO profile of 8 KB block size and 30% random write. Even at 32k block size and 50% write IO ratio, we measured 400,000 IOs at under 7 milliseconds latency.
Also remarkable is the extreme throughput we witnessed during all the tests, with special emphasis on the incredible 29.65 GB/s with an IO profile of 512k block size and 20% write ratio.
To simulate a one-node failure (Test 3), we shut down node 4, which caused node 2 to take additional ownership of the 32 restarted VMs from node 4, for a total of 64 VMs on node 2.
Similarly, to simulate a two-node failure (Test 4), we shut down nodes 3 and 4, leading to a VM reallocation process from node 3 to node 1, and from node 4 to node 2. Nodes 1 and 2 ended up with 64 VMs each.
The cluster environment continued to produce impressive results even in this degraded state. The table below compares the testing scenarios that used IO profiles aimed at identifying the maximum thresholds.
IO profile | Healthy cluster: total IOs | Healthy cluster: latency | One node failure: total IOs | One node failure: latency | Two node failure: total IOs | Two node failure: latency |
---|---|---|---|---|---|---|
B4-T2-O32-W0-PR | 4,856,796 | 0.38 ms (read) | 4,390,717 | 0.38 ms (read) | 3,842,997 | 0.26 ms (read) |
B4-T2-O16-W100-PR | 753,886 | 3.2 ms (write) | 482,715 | 5.7 ms (write) | 330,176 | 11.4 ms (write) |

IO profile | Healthy cluster: throughput | One node failure: throughput | Two node failure: throughput |
---|---|---|---|
B512-T1-O8-W0-PSI | 91 GB/s | 113 GB/s | 77 GB/s |
B512-T1-O1-W100-PSI | 8 GB/s | 6 GB/s | 10 GB/s |
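To quantify the degradation, the Python snippet below computes the percentage change in total IOs relative to the healthy cluster, using only the small-block figures from the table above. The dictionary layout and function names are our own, chosen for illustration.

```python
# Total IOs copied from the table above.
RESULTS = {
    "B4-T2-O32-W0-PR": {
        "healthy": 4_856_796,
        "one_node_failure": 4_390_717,
        "two_node_failure": 3_842_997,
    },
    "B4-T2-O16-W100-PR": {
        "healthy": 753_886,
        "one_node_failure": 482_715,
        "two_node_failure": 330_176,
    },
}


def percent_change(baseline: float, value: float) -> float:
    """Signed percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100


def degradation_table(results):
    """Percentage change versus the healthy cluster for each degraded state."""
    table = {}
    for profile, row in results.items():
        base = row["healthy"]
        table[profile] = {
            state: round(percent_change(base, ios), 1)
            for state, ios in row.items()
            if state != "healthy"
        }
    return table


degradation = degradation_table(RESULTS)
```

For the random read profile, this gives roughly 10% fewer IOs with one node down and about 21% fewer with two nodes down, even though half of the cluster is offline in the second case. Random writes are hit harder, dropping by about 36% and 56% respectively.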
Figure 4 illustrates the test results for real-life workload scenarios for the healthy cluster and for the one-node and two-node degraded states.
Figure 4: Test 3 and 4 results
Once more, we continued to see outstanding performance results from an IO, latency, and throughput perspective, even with one or two nodes failing.
One important consideration we observed is that for the 4k and 8k block sizes, IOs decrease and latency increases as one would expect, whereas for the 32k and higher block sizes we realized that:
There are two reasons for this:
We are happy to share with you these figures about the extremely resilient performance our integrated systems deliver, during normal operations or in the event of failures.
Dell EMC Integrated System for Microsoft Azure Stack HCI, especially with the AX-7525 platform, is an outstanding solution for customers struggling to support their organization’s increasingly heavy demand for resource intensive workloads and to maintain or improve their corresponding service level agreements (SLAs).