PowerEdge MX Validate Baseline to Improve Operational Efficiency
Mon, 16 Jan 2023 21:29:16 -0000
Summary
Modern compute platforms consist of many components requiring multiple firmware elements. This can lead to complexity and risk when updating these components. To eliminate this problem for MX customers, Dell produces a biennial firmware baseline and validates the complete end-to-end stack with testing built on real customer use cases. Dell OpenManage system management orchestration then offers a simple route to update live environments to this desired state at scale.
This Direct from Development (DfD) tech note describes at a high level the Dell methodology for applying updates with no disruption in service. This approach lowers risk, streamlines the update process, and saves time for organizations.
Market positioning
The PowerEdge MX is a scalable modular platform comprising compute, networking, and storage elements, designed for data center consolidation with easy deployment and rich integrated management. PowerEdge MX features an industry-leading no-midplane design and a scalable network fabric within a chassis architecture built to support today’s emerging processor technologies, new storage technologies, and new connectivity innovations well into the future.
PowerEdge MX firmware baseline
Reduce complexity and simplify operations by leveraging Dell’s MX validated solution infrastructure firmware baseline. This is a set of system and component firmware for the MX platform that is rigorously tested as “one release” in a number of configurations, using the most popular operating system environments based on real-world customer use cases. When the updates have passed this testing as a group, a validated solution stack firmware catalog that details the release versions is published. Several solutions in the OpenManage portfolio can then consume the catalog as an update blueprint.
Figure 1. MX Baseline Components
Dell MX firmware baselines offer customers an elegant, automated method for platform-wide updates. Advantages for customers include:
- Multiple releases aggregated into one consolidated update
- Dell end-to-end validation that helps eliminate the risk of element incompatibility
- Fewer maintenance windows and less downtime required for updating
Anatomy of the PowerEdge MX baseline
The PowerEdge MX validated solution baseline consists of many elements, including system BIOS, iDRAC, NICs, CNAs, Fibre Channel adapters, HBAs, and other critical updates. In addition, the stack extends into the chassis to include network switch code and the management controller software, OME-M. The MX platform baseline testing covers the capabilities of the chassis I/O modules, such as the MX9116n, MX7116n, MX5108n, and MXG610s, in all forms with scaled VLANs. It also includes testing with different configurations, protocols, and workloads. For Fibre Channel and FCoE, baseline testing also includes scenarios in NPIV Proxy Gateway, FIP Snooping Bridge, and Direct Attached modes. An example end-to-end stack test is VMware ESXi running on the compute sleds, connected to a PowerStore storage array using FCoE, with the stack updated from an old baseline to the new baseline. When the Dell updates pass evaluation, a validated solution stack firmware catalog file containing details of the tested versions is published online, ready to be consumed by Dell update mechanisms such as the update manager integrated into OME. Think of the validated baseline as a recipe for success.
When it comes to applying updates, Dell’s OpenManage system management automation provides a time-saving, centralized process with intelligent safeguards to eliminate downtime. The benefits of using OME-M to perform updates from the catalog include automatically identifying components that require updates, downloading the updates from the Dell support site, creating and scheduling update jobs, correctly ordering tasks, and reporting. The following example shows a sample catalog, highlighting the non-compliant elements. An administrator needs only to click “Make Compliant” to start the task that updates multiple elements in the MX environment.
Figure 2. Detailed view of a firmware update
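The compliance check behind a catalog-driven update can be pictured as a comparison of installed versions against the versions listed in the catalog. The following is a minimal sketch of that idea, not OME-M's actual logic: the component names, the version strings, and the `compliance_report` helper are hypothetical, and the sketch assumes purely numeric dotted version strings.

```python
def parse_version(text):
    """Turn a dotted version such as '6.10.0' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def compliance_report(installed, baseline):
    """Classify each baseline component as compliant, upgrade, downgrade, or not installed."""
    report = {}
    for component, target in baseline.items():
        current = installed.get(component)
        if current is None:
            report[component] = "not installed"
        elif parse_version(current) < parse_version(target):
            report[component] = "upgrade"
        elif parse_version(current) > parse_version(target):
            report[component] = "downgrade"
        else:
            report[component] = "compliant"
    return report


# Hypothetical data: these names and versions are invented for illustration only.
baseline = {"BIOS": "1.4.2", "iDRAC": "6.10.0", "NIC": "22.31.6"}
installed = {"BIOS": "1.4.2", "iDRAC": "6.0.30", "NIC": "22.31.6"}

print(compliance_report(installed, baseline))
# {'BIOS': 'compliant', 'iDRAC': 'upgrade', 'NIC': 'compliant'}
```

Comparing versions as integer tuples rather than strings avoids ranking "6.10.0" below "6.9.0"; real firmware version formats may need more careful parsing.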
VMware enhancement
For customers running their VMware environment on the PowerEdge MX platform, this firmware update process can be enhanced using OMEVV (OpenManage Enterprise plugin for VMware vCenter) to be “VMware cluster” aware, in order to safeguard services from outages. Cluster-aware updates apply intelligent rules that allow patching only one member of a VMware cluster at a time. Before a physical host is patched, virtual machines are systematically migrated “hot” to other ESXi hosts by leveraging ESXi maintenance mode, DRS, and vMotion, ensuring that workloads and services running on the cluster are kept online at all times. After the updates are applied, the host restarts and rejoins the cluster. DRS can then live migrate virtual machines back to the newly updated host. This sequence is repeated for each host in the cluster, offering a controlled rolling upgrade for the entire cluster.
Figure 3. OMEVV/VMware host rolling updates
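The rolling sequence described above can be sketched as a small simulation. This is an illustrative model only, not OMEVV code: the `rolling_update` function, the host and VM names, and the round-robin evacuation policy are all assumptions made for the example, and the real placement decisions are made by DRS.

```python
def rolling_update(hosts, placement, update_host):
    """Update one host at a time, moving its VMs to the other hosts first.

    hosts: list of host names in the cluster.
    placement: dict mapping VM name -> host name (modified in place).
    update_host: callable invoked with a host name once that host is empty.
    """
    for host in hosts:
        others = [h for h in hosts if h != host]
        if not others:
            raise ValueError("a rolling update needs at least two hosts")
        # "Maintenance mode": evacuate every VM on this host, round-robin.
        evacuated = [vm for vm, h in placement.items() if h == host]
        for i, vm in enumerate(evacuated):
            placement[vm] = others[i % len(others)]
        # The host carries no workloads while it is updated and restarted.
        update_host(host)
        # The host rejoins the cluster; a scheduler may move VMs back later.
    return placement


# Hypothetical three-host cluster with one VM per host.
hosts = ["esx1", "esx2", "esx3"]
placement = {"vm1": "esx1", "vm2": "esx2", "vm3": "esx3"}
updated = []
rolling_update(hosts, placement, updated.append)
print(updated)  # ['esx1', 'esx2', 'esx3']
```

Because only one host is empty at any moment, every VM always has a running host, which is the property that keeps services online during the upgrade.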
OMEVV also includes a scheduling engine to manage timed updates during quiet periods or to set maintenance windows. Larger customers can run parallel updates on up to 15 clusters simultaneously from a single console.
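A bounded worker pool is one common way to model this kind of parallel limit. The sketch below is an assumption about how such a cap could be expressed in a script, not a description of the OMEVV scheduler; `update_clusters` and the cluster names are hypothetical, and only the limit of 15 comes from the text.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_CLUSTERS = 15  # limit quoted in the text above


def update_clusters(clusters, update_cluster):
    """Run update_cluster for every cluster, at most 15 at the same time."""
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL_CLUSTERS) as pool:
        # pool.map preserves input order, so results line up with clusters.
        results = list(pool.map(update_cluster, clusters))
    return dict(zip(clusters, results))


# Hypothetical usage with a stand-in for the real per-cluster rolling update.
results = update_clusters(["c1", "c2"], lambda name: f"{name}: updated")
print(results)  # {'c1': 'c1: updated', 'c2': 'c2: updated'}
```

Any clusters beyond the fifteenth wait in the pool's queue until a worker becomes free.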
IOMs
If a customer is using an MX environment with MX9116n/MX7116n network switches in SmartFabric mode, they simply select “Make Compliant” from the OME-M GUI. There is no need to search for the correct switch code or upload it manually to the switch; it is all taken care of as part of the catalog. OME-M interfaces with the switches to upload the new code. If the switches are configured as a pair, the update runs automatically on one switch at a time to ensure problem-free connectivity during the updates.
RESTful API
The OpenManage Enterprise APIs enable the customer to integrate with other management products such as Ansible playbooks, or to build tools based on common programming and scripting languages, including Python and PowerShell. These APIs are fully documented. Dell posts many examples in its GitHub code repository for administrators and developers to download and use for free.
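As a rough illustration of what a Python integration might look like, the sketch below only builds the HTTP requests and does not send them. The resource paths (`/api/SessionService/Sessions`, `/api/UpdateService/Baselines`), the `X-Auth-Token` header, and the payload fields are written from memory of the OpenManage Enterprise API and should be treated as assumptions to verify against the API guide for your version; the helper names and the host name are hypothetical.

```python
import json
import urllib.request


def build_session_request(ome_host, username, password):
    """Build (but do not send) a request that asks OME for an API session."""
    url = f"https://{ome_host}/api/SessionService/Sessions"  # assumed path
    body = json.dumps(
        {"UserName": username, "Password": password, "SessionType": "API"}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_baselines_request(ome_host, auth_token):
    """Build (but do not send) a request that lists firmware baselines."""
    url = f"https://{ome_host}/api/UpdateService/Baselines"  # assumed path
    return urllib.request.Request(
        url, headers={"X-Auth-Token": auth_token}, method="GET"
    )


# Hypothetical host and credentials, for illustration only.
req = build_session_request("ome.example.com", "admin", "secret")
print(req.get_method(), req.full_url)
```

In a real script, each request would be passed to `urllib.request.urlopen` (or an equivalent HTTP client) with proper TLS certificate handling, and the token returned by the session call would be supplied to later calls.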
In Conclusion
Customers who rely on Dell PowerEdge MX for their compute needs can streamline the update process, saving time and ensuring firmware compliance, by leveraging MX validated solution stack firmware baselines. In addition, for VMware environments, intelligent rolling firmware updates for hosts allow updating with zero service outages and no end-user downtime.
Related Documents
Cut server migration times by upgrading to Dell PowerEdge MX from legacy Cisco UCS
Thu, 30 Mar 2023 15:55:57 -0000
Principled Technologies testing showed customers can save significant administrator time and effort by migrating legacy UCS workloads to Dell PowerEdge MX rather than Cisco UCS X.
By requiring 246 fewer administrator steps from initial configuration through server migration for a three-node cluster, choosing the Dell PowerEdge MX platform could help reduce human error and possible troubleshooting time as you move your new hardware into production. With PowerEdge MX, administrators save time as well—2 hours and 21 minutes for a three-node cluster—compared to moving to new Cisco UCS hardware. That is time that administrators can spend working on new initiatives to further their business goals.
Unlock New MX CPU and Storage Configurations with a Thermally Optimized Air-Cooled Chassis
Fri, 03 Mar 2023 20:08:02 -0000
As CPU power continues to increase across the server industry, Dell Technologies continues to offer customers feature-rich air-cooled configurations. Dell Engineering has applied thermal innovation and machine learning to the Dell PowerEdge MX chassis to support the MX760c server sled with a broad range of 4th Gen Intel® Xeon® Scalable processors and local storage configurations.
This Direct from Development tech note describes the new capabilities using air cooling that Dell has added to the PowerEdge MX configurations.
Introduction
The PowerEdge MX7000 is a modular chassis that allows customers to build a set of compute, storage, networking, and management resources to meet their specific workload needs. Industry trends, including CPUs that increase the power drawn per server sled, continually push the limits of air-cooling feature-rich configurations. Dell Engineering used machine learning combined with next-generation fans to offer high-performance 4th Gen Intel® Xeon® Scalable processors in an air-cooled chassis with more local storage configurations than previously available.
Dell Engineering expertise
There are 8! = 40,320 modular sled permutations in the 8-slot MX chassis. Dell Engineering conducted a Design of Experiments (DOE) to train a machine learning model that dynamically calculates the airflow cooling capacity for each of the eight slots. This technology enables Dell to maximize the shared cooling infrastructure of the MX7000, unlocking configurations that were previously not possible, and to provide clear guidance to customers about how to thermally optimize their chassis. When a chassis configuration is optimized for cooling, the fans run more efficiently at lower speeds across the server workload, which lowers fan power, reduces cooling costs, and reduces chassis noise.
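The permutation count can be checked directly. The short sketch below reproduces the 8! figure, which counts orderings of eight distinguishable sleds, and then shows how the number of distinct arrangements shrinks when sleds share a configuration. The mix of four "A" sleds and four "B" sleds is a hypothetical example, not a configuration taken from the tech note.

```python
import math
from itertools import permutations

SLOTS = 8

# Orderings of eight distinguishable sleds across eight slots.
print(math.factorial(SLOTS))  # 40320

# Hypothetical mix: four sleds of type A and four of type B.
# Swapping two identical sleds does not create a new arrangement.
distinct = set(permutations("AAAABBBB"))
print(len(distinct))  # 70
```

The second result equals 8! / (4! × 4!), which is why a consistent storage configuration across sleds greatly reduces the number of layouts that need to be considered.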
Thermally optimized chassis
The ability of the MX7000 chassis to air-cool the eight slots is directly affected by the storage configuration of each sled as well as the placement of sleds in the chassis. For example, pulling air through a sled that has six hard drives is harder than pulling it through a sled that has four hard drives. Machine learning is built into the sled and chassis firmware to dynamically analyze the ability of the chassis to deliver air-cooling to each sled.
A consistent storage configuration maximizes cooling across all sleds and enables the MX760c to support up to six 2.5-in. storage devices with the latest 4th Gen Intel® Xeon® CPUs.
A varied storage configuration with MX760c sleds enables support for up to four 2.5-in. storage devices to maximize cooling through each sled.
MX7000 air-cooling enhancements
The MX7000 chassis and MX sleds introduced the capability to dynamically calculate the cooling based on the chassis configuration. This capability enabled Dell to offer a thermally optimized chassis with a consistent storage configuration that increases cooling for sleds by 20 percent. Dell used this additional cooling capability to offer high-power CPUs with storage configurations that were not supported by previous generations.
The industry trend of increasing power per node every generation has significantly challenged the ability to deliver air-cooled solutions. The MX7000 chassis introduced the next-generation Gold Grade chassis fans with the MX760c sleds to provide an air-cooled solution with the latest high-powered CPUs. Gold Grade fans deliver 25 percent more cooling per sled than the previous-generation Silver Grade fans.
Enterprise Infrastructure Planning Tool
The Dell Enterprise Infrastructure Planning Tool (EIPT) helps IT professionals plan and tune their systems and infrastructure for maximum efficiency. Customers can model their customized MX7000 chassis and sled configurations in EIPT. The trained machine learning model enables the tool to identify the maximum data center ambient temperature supported by the sleds. It also identifies the most thermally optimized configuration when sleds have a varied storage configuration. This means that new and existing customers can identify the most efficient sled-to-slot configuration to optimize their chassis for maximum cooling capability while lowering power, costs, and fan noise.
Conclusion
Dell continues to deliver innovative solutions that expand the air-cooled feature-rich configuration choices for the PowerEdge MX7000 chassis and server sleds. Dell Engineering combined machine learning technology with next-generation fans to provide customers the latest high-performance CPUs with more local storage configurations than previous generations in an air-cooled chassis. In addition to the expanded air-cooling configurations, Dell also offers Direct Liquid Cooling (DLC) for the PowerEdge MX7000 chassis and server sleds. The features and potential benefits of DLC are discussed in a separate Direct from Development tech note.
References
- Tech Talk Video: The MX7000 Introduces a New Thermal Innovation
- Direct from Development Tech Note: The History of Server and Data Center Cooling Technologies