Data center upgrades and patch management are typically manual, repetitive tasks prone to configuration and implementation errors. Validating that software and hardware firmware remain interoperable when one component is upgraded requires extensive quality assurance testing in staging environments. IT must sometimes make the difficult decision either to deploy new patches before they are fully vetted or to defer them, which slows the rollout of new features, security updates, and fixes. Both situations increase risk for the customer environment.
Understanding the VMware Cloud Foundation concept of a Workload Domain helps customers better understand the details of life cycle operations. A Workload Domain is a policy-based resource container with specific availability and performance attributes that combines compute (vSphere), storage (vSAN), and networking (NSX) into a single consumable entity. When VMware Cloud Foundation runs on VxRail, these workload domains are built using VxRail clusters and leverage the native VxRail operations experience for tasks such as automated cluster builds and cluster expansions.
Infrastructure building blocks can be created based on native VxRail clusters that can scale up and out incrementally. Customers can scale up by leveraging the flexible hardware configurations available within a VxRail node to increase storage capacity or memory. Customers can similarly scale out by adding nodes to a cluster in single-node increments. The physical compute, storage, and network infrastructure becomes part of a single shared pool of virtual resources that is managed as one cloud infrastructure ecosystem using SDDC Manager.
From this shared pool, customers can organize separate pools of capacity into workload domains, each with its own set of specified CPU, memory, and storage requirements to support various workload types such as cloud native, VDI, or business-critical applications like databases. As new VxRail physical capacity is added, it is recognized by SDDC Manager and made available for consumption as part of a workload domain. Scaling workload domains beyond a single cluster is made easier by the ability to add multiple VxRail clusters within a workload domain.
As mentioned previously, Workload Domains can be created, expanded, and deleted. They can also be upgraded independently, providing customers with the flexibility to align workload domain infrastructure requirements to the applications running on them. This can be done even at the individual cluster level within a domain. With VMware Cloud Foundation, all life cycle management occurs at the workload domain level, enabling flexibility to mix and align workloads to the appropriate underlying infrastructure dependencies.
VMware Cloud Foundation on VxRail leverages both the native VMware Cloud Foundation and VxRail HCI System Software update bundles for its updates. This allows customers to take advantage of new platform features faster. No proprietary package must be generated to run VMware Cloud Foundation on VxRail, so nothing delays the publication of these updates for customer consumption once they are available. This allows both VMware and Dell to innovate faster and asynchronously within their respective layers, updating features without affecting the other layers of the platform stack. It also means that VMware and Dell can continue to leverage their respective streamlined development and release processes for VxRail and VMware Cloud Foundation independently.
VxRail life cycle management is built on ecosystem connectors that integrate vSAN cluster software and PowerEdge server hardware so that the ESXi host can be managed as a single system. This system integration enables the automation and orchestration necessary to deliver nondisruptive, streamlined HCI stack upgrades. VxRail life cycle management delivers differentiated value through its ability to provide a prevalidated set of software and firmware that ensures compatibility and compliance of the entire HCI stack configuration. It does so while maintaining the performance and availability required by the virtualized workloads running on the clusters.
Continuously Validated States describe the ability to test, validate, and produce a VxRail software bundle to support every vSphere release, any-to-any version upgrade path, and the millions of VxRail configurations. These Continuously Validated States are recorded on the Electronic Compatibility Matrix. The VxRail team’s $60 million in equipment investment with a team of more than 100 members dedicated to testing and quality makes this possible.
All VMware Cloud Foundation on VxRail life cycle update and upgrade operations are orchestrated using SDDC Manager. It is responsible for monitoring the respective VMware and Dell support repositories where the VMware Cloud Foundation and VxRail update bundles are published. The VMware Cloud Foundation update bundles include bundles for vCenter, NSX, SDDC Manager, and vRealize Suite Lifecycle Manager (vRSLCM). Aria Suite (formerly vRealize Suite) components (vRealize Automation, vRealize Operations, and vRealize Log Insight) are then updated by applying the respective component update bundles using vRSLCM. Aria Suite has been engineered to be VMware Cloud Foundation aware, and VMware Cloud Foundation is Aria Suite aware. Each VMware Cloud Foundation release includes a qualified version of vRSLCM in the release software bill of materials (BOM). SDDC Manager can optionally be used to deploy vRSLCM and, in doing so, establish a two-way communication channel between the two products. vRSLCM is then “VMware Cloud Foundation aware” and reports back to SDDC Manager which vRealize components are installed.
The native VxRail update bundle includes ESXi, vSAN, VxRail Manager, hardware firmware and drivers. As a part of this monitoring, SDDC Manager automatically discovers when new VxRail and VMware Cloud Foundation updates are available for download and proactively notifies the administrator accordingly within the user interface.
SDDC Manager also ensures that all update bundles are automatically curated, so that administrators can see and access only the updates that have been qualified and are supported for the system configuration it is managing. For example, an update cannot be accessed for a workload domain until it has first been applied to the management domain. SDDC Manager also controls the ordering of life cycle management updates so that a bundle version cannot be applied until all update prerequisites have been verified. This helps mitigate risk by keeping the system at a known good state from one version to the next. It also removes any need for the administrator to guess about valid releases or to cross-reference support matrixes to ensure update bundle compatibility across the system.
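The curation rules described above can be illustrated with a small model. The following is a minimal sketch, not SDDC Manager code: the `Bundle` and `Domain` classes, the `requires` field, and the `visible_bundles` function are hypothetical names introduced here only to show how "management domain first" and prerequisite ordering could be expressed.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Bundle:
    """Hypothetical update bundle with the bundle IDs it depends on."""
    bundle_id: str
    requires: FrozenSet[str] = frozenset()


@dataclass
class Domain:
    """Hypothetical domain record tracking which bundles were applied."""
    name: str
    is_management: bool = False
    applied: Set[str] = field(default_factory=set)


def visible_bundles(domain: Domain, management: Domain,
                    bundles: List[Bundle]) -> List[str]:
    """Return the bundle IDs that may be offered to `domain`.

    A bundle is hidden if it is already applied, if its prerequisites
    are not yet applied to the domain, or (for a workload domain) if the
    management domain has not received it first.
    """
    visible = []
    for bundle in bundles:
        if bundle.bundle_id in domain.applied:
            continue  # already installed on this domain
        if not bundle.requires <= domain.applied:
            continue  # prerequisites missing on this domain
        if not domain.is_management and bundle.bundle_id not in management.applied:
            continue  # management domain must be updated first
        visible.append(bundle.bundle_id)
    return visible
```

In this model, a workload domain is only ever offered bundles that the management domain already runs, which mirrors the ordering constraint described in the text.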
All updates are scheduled and orchestrated by SDDC Manager, but they may be performed by either SDDC Manager or VxRail Manager using integrated APIs, as shown in Figure 15.
Once a set of updates has been downloaded, SDDC Manager is used to schedule the updates to be applied to each of the workload domains in the environment independently.
Life cycle management in SDDC Manager can be applied to the Management Domain, which contains the SDDC software stack, or to individual workload domains, and it does not disrupt tenant virtual machines (VMs). Using live VM migration together with vSphere Distributed Resource Scheduler (DRS), SDDC Manager can update software to improve infrastructure security and reliability. VMware and Dell perform extensive validation testing of the software stack before releasing software updates, which reduces risk and helps to instill confidence.
The SDDC Manager Lifecycle Management view provides notification of update availability and download of the update bundle. The SDDC Manager interface also provides for selecting update targets and scheduling the update. It is highly recommended to schedule updates at a time when the SDDC Manager is not in heavy use and avoid any changes to the domains until after the upgrade completes.
Before starting the update, there are prerequisite tasks to ensure that the system is in a healthy state. The precheck utility can be manually triggered in the SDDC Manager update screen as shown in Figure 16.
These VMware Cloud Foundation prechecks also natively integrate with VxRail Health Check APIs to capture native VxRail cluster-specific hardware and software health.
The update bundles can be scheduled for automatic installation and can be applied to any cluster within any workload domain across data centers and across the edge. Administrators can select and schedule which clusters in a multicluster workload domain (WLD) they want to update, which gives them control over the order in which clusters are updated during an LCM operation. This allows the cloud administrator to target specific workloads or environments (development vs. production, for example) for updates independently of the rest of the environment.
For native VMware Cloud Foundation software updates, SDDC Manager performs the automated workflows that are required to apply those updates to the clusters within a workload domain.
For native VxRail updates, SDDC Manager orchestrates the LCM process for a given workload domain but leverages the native VxRail Manager that runs on each VxRail cluster in that workload domain to apply the VxRail update using integrated VxRail Manager REST API calls in the background. As VxRail Manager performs the cluster update, SDDC Manager monitors its progress, and the VxRail Manager will notify the user when the process is complete. In a multicluster workload domain, SDDC Manager calls each VxRail cluster’s VxRail Manager APIs automatically, and no administrator input is needed until all clusters in the workload domain have been updated.
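The delegation pattern described above, in which one orchestrator hands each cluster to that cluster's own manager and tracks the outcome, can be sketched as follows. This is an illustration under stated assumptions, not the product's implementation: `update_workload_domain` and the `apply_update` callable are hypothetical stand-ins for the integrated VxRail Manager REST calls, and stopping at the first failed cluster is an assumption made for the sketch rather than documented behavior.

```python
from typing import Callable, Dict, List


def update_workload_domain(clusters: List[str],
                           apply_update: Callable[[str], bool]) -> Dict[str, str]:
    """Update clusters one after another and record a status for each.

    `apply_update` stands in for the call to a cluster's own manager and
    returns True on success. After the first failure, the remaining
    clusters are marked SKIPPED (an assumption of this sketch).
    """
    results: Dict[str, str] = {}
    failed = False
    for cluster in clusters:
        if failed:
            results[cluster] = "SKIPPED"
            continue
        succeeded = apply_update(cluster)
        results[cluster] = "COMPLETED" if succeeded else "FAILED"
        failed = not succeeded
    return results
```

The order of the `clusters` list is the order of the updates, so an administrator-chosen sequence (development clusters before production clusters, for example) maps directly onto the argument.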
All these co-engineered features drive the full-stack integrated life cycle management experience that is available only with VMware Cloud Foundation on VxRail. This integration offers a true better-together experience to help Dell customers simplify and accelerate their IT transformation.
VMware and Dell Technologies constantly strive to improve the automated life cycle management experience that is integrated in the platform. Starting with VMware Cloud Foundation 4.0.1 on VxRail 7.0, customers can upgrade specific clusters within a workload domain. This provides administrators with more flexibility in planning maintenance windows. VMware Cloud Foundation also supports NSX Edge cluster-level and parallel upgrades, which offer more flexibility and efficiency in updating this critical component of the platform and better alignment with maintenance windows. VMware Cloud Foundation skip-level upgrades are also supported and can be performed from the SDDC Manager web-based UI. This provides additional efficiency by eliminating the requirement to install intermediate stepwise upgrades for customers who perform LCM operations on the platform less often. The updated SDDC Manager LCM Manifest architecture also allows VMware and customers to respond more quickly to potential changes in upgrade sequencing, which provides more agility and further reduces risks related to software and hardware firmware upgrades.
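The difference between stepwise and skip-level upgrades can be shown with a short sketch. The release labels below are placeholders rather than real VMware Cloud Foundation versions, `upgrade_path` is a hypothetical helper, and the sketch assumes that every skip-level jump is supported, which in practice depends on the published upgrade paths.

```python
from typing import List


def upgrade_path(releases: List[str], current: str, target: str,
                 skip_level: bool) -> List[str]:
    """Return the releases to install to move from `current` to `target`.

    `releases` is ordered oldest to newest. Without skip-level support,
    every intermediate release is installed in turn; with it, only the
    target release is installed.
    """
    start = releases.index(current)
    end = releases.index(target)
    if end <= start:
        raise ValueError("target must be newer than the current release")
    if skip_level:
        return [target]
    return releases[start + 1:end + 1]
```

For an environment that is three releases behind, the stepwise path contains three installs while the skip-level path contains one, which is the maintenance-window saving the paragraph describes.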
To avoid potential issues during LCM activities, VMware Cloud Foundation administrators can run SDDC Manager prechecks to identify problems before any LCM operation is run. VMware Cloud Foundation on VxRail includes an extensive set of SDDC and VxRail specific health prechecks that have been integrated with the native SDDC Manager precheck workflows to identify many of the common system states that could cause issues with LCM operations. Prechecks include password validity (including expired passwords), file system permissions, file system capacity, CPU reservation for NSX Managers, hosts in maintenance mode, and DRS configuration mode, among others. The following figure illustrates some examples of what these prechecks look like from the SDDC Manager UI.
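A precheck pass of this kind amounts to evaluating a set of named conditions against the current system state and reporting the ones that fail. The sketch below models four of the checks listed above. The state keys, the 20 percent free-space threshold, and the expected DRS mode value are assumptions chosen for illustration and do not reflect the thresholds or field names that SDDC Manager uses.

```python
from typing import Callable, Dict, List

# Each precheck maps a name to a predicate over a system-state dictionary.
# All keys and thresholds here are hypothetical.
PRECHECKS: Dict[str, Callable[[dict], bool]] = {
    "password_not_expired": lambda s: not s["password_expired"],
    "filesystem_capacity": lambda s: s["free_disk_pct"] >= 20,
    "no_hosts_in_maintenance_mode": lambda s: s["hosts_in_maintenance"] == 0,
    "drs_mode": lambda s: s["drs_mode"] == "fullyAutomated",
}


def run_prechecks(state: dict) -> List[str]:
    """Return the names of the prechecks that fail for `state`."""
    return [name for name, check in PRECHECKS.items() if not check(state)]


def ready_for_lcm(state: dict) -> bool:
    """An LCM operation should start only when no precheck fails."""
    return not run_prechecks(state)
```

Returning the list of failed check names, rather than a single pass or fail flag, lets the administrator see exactly what to remediate before retrying.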
Recent VMware Cloud Foundation enhancements include the concept of flexible cloud management LCM operations. In VMware Cloud Foundation 4.4 and later, Aria Suite component updating and upgrading operations are managed independently from VMware Cloud Foundation using vRealize Suite Lifecycle Manager (vRSLCM) directly. Thus, administrators can upgrade Aria products independently from the core VMware Cloud Foundation upgrade to better align with business requirements. This functionality also helps simplify the core VMware Cloud Foundation upgrade process. Aria Suite upgrades do not have to be performed for a VMware Cloud Foundation upgrade if existing components are still compatible with the version being upgraded to. Thus, administrators have more flexibility on which Aria Suite components are updated and when they are updated.
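The rule that an Aria Suite component needs upgrading only when its installed version is not compatible with the target release can be expressed as a simple lookup. This is a sketch only: the component names, version strings, and the shape of the compatibility table are hypothetical, and real compatibility data comes from the vendor's interoperability information.

```python
from typing import Dict, List, Set


def components_requiring_upgrade(installed: Dict[str, str],
                                 compatible: Dict[str, Set[str]]) -> List[str]:
    """Return components whose installed version is not compatible.

    `installed` maps component name to installed version. `compatible`
    maps component name to the versions supported by the target release.
    A component missing from `compatible` is treated as incompatible.
    """
    return sorted(
        name for name, version in installed.items()
        if version not in compatible.get(name, set())
    )
```

An empty result means the core VMware Cloud Foundation upgrade can proceed without touching any Aria Suite component, which is the flexibility described above.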
Reducing maintenance window timelines is always a design goal for VMware Cloud Foundation on VxRail engineering teams to deliver on for IT teams. This is especially true in circumstances where an LCM operation has not fully completed for all hosts in a cluster. It would be inefficient, adding unnecessary time to maintenance windows, to have to start an LCM process from the beginning on hosts where an update was already successful. To avoid this situation, VMware Cloud Foundation on VxRail implements a jointly engineered LCM method that is available through the VxRail Retry API and is fully integrated with SDDC Manager. It adds logic that allows the new cluster LCM update retry function to target only the failed nodes. This enhancement drastically reduces LCM upgrade times and helps IT teams meet their maintenance windows, especially for VMware Cloud Foundation on VxRail deployments with many large workload domain clusters. It also demonstrates close collaboration and commitment from Dell and VMware engineering teams for continuous improvements of the platform based on customer feedback, and deep integration between VMware Cloud Foundation software and VxRail engineered system.
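The retry logic described above comes down to selecting only the hosts whose update failed and leaving successful hosts untouched. The following minimal sketch assumes a per-host status map with `SUCCEEDED` and `FAILED` values. The function names and status strings are hypothetical and are not the VxRail Retry API.

```python
from typing import Callable, Dict, List


def retry_targets(host_status: Dict[str, str]) -> List[str]:
    """Return only the hosts whose previous update attempt failed."""
    return [host for host, status in host_status.items() if status == "FAILED"]


def retry_update(host_status: Dict[str, str],
                 upgrade_host: Callable[[str], bool]) -> Dict[str, str]:
    """Re-run the update on failed hosts and return the new status map.

    `upgrade_host` stands in for the real per-host upgrade and returns
    True on success. Hosts that already succeeded are never passed to it.
    """
    updated = dict(host_status)
    for host in retry_targets(host_status):
        updated[host] = "SUCCEEDED" if upgrade_host(host) else "FAILED"
    return updated
```

In a cluster where only a few hosts failed, the retry touches only those hosts instead of every host, which is where the time saving comes from.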
The VMware Cloud Foundation Async Patch Tool is a CLI-based tool that allows cloud administrators to apply individual component out-of-band security patches to their VMware Cloud Foundation on VxRail environment, separate from an official VMware Cloud Foundation LCM update release. This enables organizations to address security vulnerabilities faster without having to wait for a full VMware Cloud Foundation release update. It also gives administrators control to install these patches without requiring the engagement of support resources.
VMware Cloud Foundation on VxRail supports the use of the VMware Cloud Foundation Async Patch Tool for ESXi, vCenter, NSX, and VxRail Manager security patch updates. Once patches have been applied and a new VMware Cloud Foundation BOM update that includes the security fixes is available, administrators can use the tool to download the latest VMware Cloud Foundation LCM release bundles. They can then upgrade their environment to an official in-band VMware Cloud Foundation release BOM. Administrators can then continue to use the native SDDC Manager LCM workflow process to apply additional VMware Cloud Foundation on VxRail upgrades.