Take Advantage of the Latest Enhancements to VxRail Life Cycle Management
Tue, 20 Jun 2023 16:52:40 -0000
Providing the best life cycle management experience for HCI is not easy, nor is it a one-time job for which we can pat ourselves on the back and move on to the next endeavor. It’s a continuous cycle that incorporates feature enhancements and improvements based on your feedback. While we know that improving VxRail LCM is vitally important for us to continue to deliver differentiating value to you, it is just as important that your clusters continue to run the latest software to realize the benefits. In this post, I’ll provide a deep dive into the LCM enhancements introduced in the past few software releases so you can consider the added functionality that you can benefit from.
Focus areas for improved LCM
Going back into last year, we prioritized four focus areas to improve your LCM experience. While the value is incremental when you look at just a single software release, this post provides a holistic perspective of how VxRail has improved upon LCM over time to further increase the efficiencies that you enjoy today.
- Based on data that we have gathered on reported cluster update failures, we found that almost half of the update failures occurred because a node failed to enter maintenance mode. Effectively addressing this issue can potentially be the most impactful benefit for our customer base.
- As the VxRail footprint expands beyond the data center, resource constraints such as network bandwidth and Internet connectivity can become significant hurdles for effectively deploying infrastructure solutions at the edge. Recent enhancements in VxRail focused on creating space-efficient LCM bundle transfers.
- Doing more with less is a common thread across all organizations and industries. In the context of VxRail LCM, we’re looking to further simplify your cluster update planning experience by putting more actionable information at your fingertips.
- While no product, including VxRail, can avoid a failure from ever happening, VxRail looks to put you in a better position to protect your cluster and quickly recover from a failure.
Figure 1. 12+ month recap of LCM enhancements
Now that you know about the four focus areas, let’s get into the details about the actual improvements that have been introduced in the last 12+ months.
Mitigating maintenance mode failures
In our investigation, we identified three major issues that caused cluster updates to fail because a node did not enter maintenance mode as expected:
- VMware Tools was still mounted on a VM.
- VMs were pinned to a host due to an existing policy.
- vSAN resynchronization was taking too long and exceeded the timeout value.
In VxRail 7.0.350, prechecks were added for the first two issues. When a pre-update health check is run, these new VxRail prechecks identify those issues if they exist and alert you in the report so that you can remedy the issue before initiating a cluster update. In the same release, the timeout value to wait for a node to enter maintenance mode was doubled to reduce the chance that vSAN resynchronization does not finish in time.
Next, the cluster update capability set was also enhanced to address a cluster update failure due to a node not entering maintenance mode as expected. With the combination of enhancements made to cluster update error handling and cluster update retry operations in VxRail 7.0.350 and VxRail 7.0.400 respectively, VxRail is now able to handle this scenario much more efficiently. If a node fails to enter maintenance mode, the cluster update operation now skips the node and continues on to the next node instead of failing out of the operation altogether. Upon running the cluster update retry operation, VxRail can automatically detect which node requires an update instead of updating the entire cluster.
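The skip-and-retry behavior can be pictured with a minimal sketch. This is not VxRail code: the function `run_cluster_update`, the `enter_maintenance_mode` callback, and the node names are hypothetical, and the version strings are only placeholders. It shows one way an update loop can skip a node that cannot enter maintenance mode, and how a retry can pick up only the nodes that still need the update.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class UpdateResult:
    updated: List[str] = field(default_factory=list)
    skipped: List[str] = field(default_factory=list)


def run_cluster_update(
    nodes: List[str],
    node_versions: Dict[str, str],
    target_version: str,
    enter_maintenance_mode: Callable[[str], bool],
) -> UpdateResult:
    """Update nodes one at a time, skipping any that cannot enter maintenance mode."""
    result = UpdateResult()
    for node in nodes:
        if node_versions[node] == target_version:
            continue  # already at the target version; a retry leaves it alone
        if not enter_maintenance_mode(node):
            result.skipped.append(node)  # skip the node instead of failing the run
            continue
        node_versions[node] = target_version  # stand-in for the real update work
        result.updated.append(node)
    return result


# Hypothetical three-node cluster; node-2 fails to enter maintenance mode once.
versions = {"node-1": "7.0.350", "node-2": "7.0.350", "node-3": "7.0.350"}
first = run_cluster_update(list(versions), versions, "7.0.400", lambda n: n != "node-2")
retry = run_cluster_update(list(versions), versions, "7.0.400", lambda n: True)
```

In this sketch the first run updates node-1 and node-3 and records node-2 as skipped; the retry touches only node-2.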
Space-efficient LCM bundle transfers
The next area of improvement addressed reducing the package sizes of the LCM bundles. A smaller package size can be very beneficial for bandwidth-constrained environments such as edge locations.
VxRail 7.0.350 introduced the capability for you to designate a local Windows client at your data center to be the central repository and distributor of LCM bundles for remote VxRail clusters that are not connected to the Internet. Using a separate PowerShell cmdlet installed on the client, you can initiate space-efficient bundle transfers from the client to your remote clusters in your internal network. The transfer operation automatically scans the manifest of the Continuously Validated State (VxRail software version) running on the VxRail cluster and determines the delta compared to the requested LCM bundle. Instead of transferring the full LCM bundle, which is greater than 10 GB in size, it packages only the necessary installation files. A much smaller LCM bundle can cut down on bandwidth usage and transfer times.
Figure 2. Central repository and distributor of LCM bundles to remote VxRail clusters
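To make the delta idea concrete, here is a small sketch of comparing two manifests. The manifest format, the component names, and the version numbers are all made up for illustration; the actual VxRail manifest layout is not described in this post.

```python
from typing import Dict


def files_to_transfer(installed: Dict[str, str], target: Dict[str, str]) -> Dict[str, str]:
    """Return only the target components whose version differs from what is installed."""
    return {
        component: version
        for component, version in target.items()
        if installed.get(component) != version
    }


# Hypothetical manifests: component name -> version.
installed_manifest = {"esxi": "1.0", "vxrail-manager": "1.0", "nic-firmware": "2.1"}
target_manifest = {"esxi": "1.1", "vxrail-manager": "1.1", "nic-firmware": "2.1"}

delta = files_to_transfer(installed_manifest, target_manifest)
```

Here only the two changed components end up in the delta; the unchanged firmware entry is left out of the transfer.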
In VxRail 7.0.450, space-efficient LCM bundles can also be created when VxRail Manager downloads an LCM bundle from the Dell cloud. This feature requires that the VxRail Manager be connected to the Dell cloud.
Simplified cluster update planning experience
The next set of LCM enhancements is centered on giving you the critical insights that maximize the probability of a successful cluster update, and on keeping that information up to date and readily available whenever you need it.
Since VxRail 7.0.400, the pre-update health check includes a RecoverPoint for VMs compatibility precheck that detects whether the RecoverPoint for VMs software version currently in use is compatible with the target VxRail software version.
VxRail 7.0.450 increased the frequency at which the VxRail prechecks file is updated. The increased frequency ensures that any additional prechecks added by engineering because of technology changes or new learnings from support cases are incorporated into the VxRail prechecks file that is run against your cluster. When your cluster is connected to the Dell cloud, VxRail Manager periodically scans for the latest VxRail prechecks file.
VxRail 7.0.450 also automated the health check to run every 24 hours. The combination of automated VxRail prechecks file scans and health check runs ensures that you have access to an up-to-date health check report once you log in to VxRail Manager.
VxRail 7.0.450 also further simplified your cluster update planning experience by consolidating into a single, exportable report all the necessary insights about your cluster to help you decide whether to move forward with a cluster update. This update advisor report has four sections:
- VxRail Update Advisor Report Summary includes the current VxRail version running on the cluster, the target (or selected) VxRail version, estimated duration to complete a cluster update, a link to the release notes, and information about your backup for your service VMs.
Figure 3. Update advisor report—summary report
- VxRail Components shows which components need to be updated to get to the target VxRail version. The table includes the current version and target version for each component.
Figure 4. Update advisor report—components report
- VxRail Precheck is the previously mentioned pre-update health check report, inclusive of all the enhancements discussed.
Figure 5. Update advisor report—LCM precheck report
- VxRail Custom Components is a report that highlights user-managed components installed on the cluster. You should consider these custom components when deciding whether to schedule a cluster update.
Figure 6. Update advisor report—custom components report
When VxRail Manager is connected to the Dell cloud, it automatically scans for new update paths. Once a new update path is detected, VxRail Manager downloads a lightweight manifest file that contains all the information needed to produce the update advisor report. The report is automatically generated every 24 hours. This feature is designed to streamline the availability of up-to-date critical insights to help you make an informed decision about a cluster update.
Serviceability
The last set of LCM enhancements that I will cover is around serviceability. While many of the features discussed earlier are meant to be proactive and to prevent failures, there are times when failures can still occur. Being able to efficiently troubleshoot the issues is critically important to getting your clusters back up and running quickly.
In VxRail 7.0.410, the logging capability was enhanced in a couple of areas so that the Dell Support team can pinpoint issues faster. When a pre-update health check identifies failures, the offending host is now recorded. If a node does fail to enter maintenance mode, the logs now capture the reason for the failure.
In VxRail 7.0.450, we automated the backup of the VxRail Manager VM and vCenter Server VM (if it’s VxRail managed). Now you can easily back up your service VMs before updating a cluster.
Figure 7. Automate VxRail backup of service VMs before a cluster update
This feature is also integrated into the update advisor report, where you can see the latest backup on the report summary and click a link to go to the backup page to create another backup.
Value of VxRail life cycle management
If life cycle management is one of the major reasons that you chose to invest in VxRail, our continuous improvements to life cycle management should be a compelling reason to keep your clusters running the latest software. VxRail life cycle management continues to provide significant value by addressing the challenges that your organization faces today.
Figure 8. VxRail benefits (data from "The Business Value of Dell VxRail HCI," April 2023, IDC)
The IDC study The Business Value of Dell VxRail HCI, sponsored by Dell Technologies, shows that the value VxRail LCM provides to organizations is significant. The results of this study are proof points for why you should continue investing in VxRail to mitigate these challenges:
- Overburdened IT staff. The automated LCM and the mechanisms in VxRail that maintain cluster integrity throughout the life of the cluster drive significant efficiencies in your IT infrastructure team.
- Unplanned outages that lead to significant disruption to businesses. The benefit of pretested and prevalidated sets of drivers, firmware, and software, which we call VxRail Continuously Validated States, is the significant reduction in risk as you update your HCI cluster from one version to the next.
- More time spent on deploying infrastructure and the resulting slowdown of the pace at which your business can innovate. The automation and integrated validation checks speed up deployment times without compromising security.
The emphasis that we put on improving your LCM experience is extraordinary, and we encourage you to maximize your investment in VxRail. Updating to the latest VxRail software release gives you access to the many LCM enhancements that can drive greater efficiencies in your organization. And with VxRail Continuously Validated States, you can safely get to the next software release and the ones that follow.
For more information about the features in VxRail 7.0.400, check out this blog post:
For more information about the features in VxRail 7.0.450, see this post:
If you want to learn about the latest in the VxRail portfolio, you can check the VxRail page on the Dell Technologies website:
Author: Daniel Chiu, VxRail Technical Marketing
Related Blog Posts
Learn About the Latest Major VxRail Software Release: VxRail 7.0.450
Thu, 11 May 2023 16:14:15 -0000
To our many VxRail customers, you know that our innovation train is a constant machine that keeps on delivering more value while keeping you on a continuously validated track. The next stop on your VxRail journey brings you to VxRail 7.0.450 which offers significant benefits to life cycle management and dynamic node clusters.
This blog provides a deep dive into some of the life cycle management enhancements as well as PowerStore Life Cycle Management integration into VxRail Manager for VxRail dynamic node clusters. For a more comprehensive rundown of the features introduced in this release, see the release notes.
Life cycle management
The life cycle management features that I am covering can provide the most impact to our VxRail customers. The first set of features are designed to offer you actionable information at your fingertips. Imagine taking your first sip of coffee or tea as you log onto VxRail Manager at the start of your day, and you immediately have all the up-to-date information that you need to make decisions and plan out your work.
VxRail pre-update health check
The VxRail pre-update health check, or pre-check as the VxRail Manager UI refers to it, has been an important tool for you to determine the overall health of your clusters and assess the readiness for a cluster update. The output of this report helps you to be aware of troublesome areas and provides you with information, such as Knowledge Base articles, to resolve the issues. This tool relies on a script that can be automatically uploaded onto the VxRail Manager VM, if the cluster is securely connected to the Dell cloud, or manually uploaded as a bundle procured from the Dell Support website.
For the health check to stay reliable and improve over time, the development of the health check script needs to incorporate a continuous feedback loop so that the script can easily evolve. Feedback can come from our Dell Services and escalation engineering teams as they learn from support cases, and from the engineering team as new capabilities and additions are introduced to the VxRail offering.
To provide an even more accurate assessment of the cluster health and readiness for a cluster update, the VxRail team has increased the frequency of how often the health check script is updated. Starting with VxRail 7.0.450, clusters that are connected to the Dell cloud will automatically scan for new health check scripts multiple times per day. The health check will automatically run every 24 hours, with the latest script in hand, so that you will have an up-to-date report ready for your review whenever you log onto VxRail Manager. This enhancement has just made the pre-update health check even more reliable and convenient.
For clusters that are not connected to the Dell cloud, you can still benefit from the increased frequency of health script updates. However, you are responsible for checking for any updates on the Dell Support website, downloading them, and staging the script on VxRail Manager for the tool to utilize it.
VxRail cluster update planning
The next enhancement that I will delve into provides a simpler and more convenient cluster update planning experience. VxRail 7.0.450 introduces more automation into the cluster update planning operations, so that you have all the information that you need to plan for an update without manual intervention.
For a cluster connected to the Dell cloud, VxRail Manager will automatically scan for new update paths that are relevant to that particular cluster. This scan happens multiple times a day. If a new update path is found, VxRail Manager will download the lightweight manifest file from that target LCM composite bundle. This file provides the metadata of the LCM composite bundle, including the manifest of the target VxRail Continuously Validated State.
The following figure shows the information of two update paths provided by their manifest files to populate the Internet Updates tab. That information includes the target VxRail software version, estimated cluster update time, link to the release notes, and whether reboots are required for the nodes to complete an update to this target version. (You can disregard the actual software version numbers: these are engineering test builds used to demonstrate the new functionality.)
VxRail Manager, by default, will recommend the next software version on the same software train. For the recommended path, VxRail Manager automatically generates an update advisor report, the new feature for cluster update planning. An update advisor report is a single exportable report that consolidates the output from existing planning tools:
- Same metadata of the update path, as provided on the Internet Updates tab.
- The update advisory report that provides component-by-component change analysis, which helps users build IT infrastructure change reports.
- The health check report that was discussed earlier.
- The user-managed component report that reminds users whether they need to update non-VxRail managed components for a cluster update.
This report is automatically generated every 24 hours so that you can log onto VxRail Manager and have all the up-to-date information at your disposal to make informed decisions. This feature will make your life easier because you no longer have to manually run all these jobs and wait for them to complete!
For a non-recommended update, you can manually generate an update advisor report using the Actions button for the listed update path. For clusters not connected to Dell cloud, you can still benefit from the update advisor report. However, instead of downloading a lightweight manifest file, you would have to download the full LCM bundle from the Dell Support website to generate the report.
The last life cycle management feature that I want to focus on is about smart bundles. The term ‘smart bundle’ refers to a space-efficient LCM bundle that can be downloaded from the Dell cloud. For VxRail users who are using CloudIQ today to manage their VxRail clusters, this feature is familiar to you. A space-efficient bundle is created by first performing a change analysis of the VxRail Continuously Validated State currently running on a cluster versus the target VxRail Continuously Validated State that a user wants to download for their cluster. The change analysis determines the delta of install files in the full LCM bundle that the cluster needs to download in order to update to the target version.
In VxRail 7.0.450, you can now initiate smart bundle transfers from VxRail Manager. Smart bundles can greatly reduce the transfer size of an update bundle, which can be extremely beneficial for bandwidth-constrained environments. To use the smart bundle feature, the cluster has to be configured to connect to CloudIQ in the Dell cloud. If VxRail Manager is not properly configured to use the smart bundle feature or if the smart bundle operation fails, VxRail Manager defaults to using the traditional method of downloading the full LCM bundle from the Dell cloud.
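The fallback behavior described here follows a common pattern: try the lighter option first, and drop back to the traditional one if it is unavailable or fails. The sketch below is only an illustration of that pattern; `download_bundle` and its two download callbacks are hypothetical and do not represent VxRail Manager internals.

```python
from typing import Callable, Tuple


def download_bundle(
    smart_download: Callable[[], bytes],
    full_download: Callable[[], bytes],
    cloud_configured: bool,
) -> Tuple[str, bytes]:
    """Prefer the smart bundle; fall back to the full bundle if it is unavailable or fails."""
    if cloud_configured:
        try:
            return "smart", smart_download()
        except Exception:
            # Broad catch is deliberate in this sketch: any smart bundle
            # failure falls through to the full download below.
            pass
    return "full", full_download()


def failing_smart() -> bytes:
    """Simulates a smart bundle operation that fails."""
    raise RuntimeError("smart bundle operation failed")


kind, payload = download_bundle(failing_smart, lambda: b"full-bundle", cloud_configured=True)
```

In this run the smart bundle attempt raises, so the call returns the full bundle instead of surfacing the error.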
VxRail dynamic nodes with PowerStore
VxRail 7.0.450 introduces the much-anticipated integration of PowerStore life cycle management into VxRail Manager for a configuration consisting of VxRail dynamic nodes using PowerStore as the primary storage (also referred to as Dynamic AppsON). This integration further centralizes PowerStore management onto the vCenter Server console for VMware environments. With the Virtual Storage Integrator (VSI) plugin to vCenter, you have been able to provision PowerStore storage and manage data services. Now, you can use the VxRail Manager plugin to manage a PowerStore update and view the array’s software version.
To enable this functionality, VxRail leverages the VSI’s new API server to communicate with the PowerStore Manager and initiate lifecycle management operations and retrieve status information. The API server was developed exclusively for VxRail Manager in a Dynamic AppsON configuration. You start the LCM workflow by first uploading the update bundle to PowerStore Manager, then running an update pre-check, and lastly running the update. The operations are initiated from VxRail Manager but the actual operations are executed on the PowerStore Manager.
The following video shows the PowerStore LCM workflow that can be run from the VxRail Manager. You can update a PowerStore that is using any storage type, except NFS, as the primary storage for a VxRail dynamic node cluster.
Although VxRail 7.0.450 is a jam-packed release with many new features and enhancements, the features I’ve described are the headliners and deserve a deeper dive to unpack the capability set. Overall, the set of LCM enhancements in this release provides immense value for your future cluster management and update experience. For the full list of features introduced in this release, see the release notes. And for more information about VxRail in general, check out the Dell VxRail Hyperconverged Infrastructure page on www.dell.com.
Author: Daniel Chiu
Learn More About the Latest Major VxRail Software Release: VxRail 7.0.480
Tue, 24 Oct 2023 15:51:48 -0000
Happy Autumn, VxRail customers! As the morning air gets chillier and the sun rises later, this blog on our latest software release – VxRail 7.0.480 – paired with your Pumpkin Spice Latte will give you the boost you need to kick-start your day. It may not be as tasty as freshly made cider donuts, but this software release has significant additions to the VxRail lifecycle management experience that can surely excite everyone.
VxRail 7.0.480 provides support for VMware ESXi 7.0 Update 3o and VMware vCenter 7.0 Update 3o. All existing platforms that support VxRail 7.0, except ones based on Dell PowerEdge 13th Generation platforms, can upgrade to VxRail 7.0.480. This includes the VxRail systems based on PowerEdge 16th Generation platforms that were released in August.
Read on for a deep dive into the VxRail Lifecycle Management (LCM) features and enhancements in this latest VxRail release. For a more comprehensive rundown of the features and enhancements in VxRail 7.0.480, see the release notes.
Improving update planning activities for unconnected clusters or clusters with limited connectivity
VxRail 7.0.450, released earlier this year, provided significant improvements to update planning activities in a major effort to streamline administrative work and increase cluster update success rates. Enhancements to the cluster pre-update health check and the introduction of the update advisor report were designed to drive even more simplicity to your update planning activities. By having VxRail Manager automatically run the update advisor report, inclusive of the pre-update health check, every 24 hours against the latest information, you will always have an up-to-date report to determine your cluster’s readiness to upgrade to the latest VxRail software version.
If you are not familiar with the LCM capabilities added in VxRail 7.0.450, you can review this blog for more information.
VxRail 7.0.450 offered a seamless path for clusters that are connected to the Dell cloud to take advantage of these new capabilities. Internet-connected clusters can automatically download LCM pre-checks and the installer metadata files, which provide the manifest information about the latest VxRail software version, from the Dell cloud. The ability to periodically scan the Dell cloud for the latest files ensures the update advisor report is always up to date to support your decision-making.
While unconnected clusters could use these features, the user experience in VxRail 7.0.450 made it more cumbersome for users to upload the latest LCM pre-checks and installer metadata files. VxRail 7.0.480 aims to improve the user experience for those who have clusters deployed in dark or remote sites that have limited network connectivity.
Starting in VxRail 7.0.480, users of unconnected clusters will have an easier experience uploading the latest LCM pre-checks file onto VxRail Manager. The VxRail Manager UI has been enhanced, so you no longer have to upload via CLI.
Knowing that some clusters are deployed in areas where network bandwidth is at a premium, the VxRail Manager UI has also been updated so that you only need to upload the installer metadata file to generate the update advisor report. In VxRail 7.0.450, users had to upload the full LCM bundle for the update advisor report. The difference in the payload size of greater than 10 GB for a full LCM bundle versus a 50 KB installer metadata file is a tremendous improvement for bandwidth-constrained clusters, eliminating a barrier to relying on the update advisor report as a standard cluster management practice. With VxRail 7.0.480, whether you have connected or unconnected clusters, these update planning features are easy to use and will help increase your cluster update success rates.
To support these new capabilities, the Local Updates tab now has two sub-tabs:
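A quick back-of-the-envelope calculation shows the scale of that difference. The sketch assumes decimal units, treats "greater than 10 GB" as exactly 10 GB, and uses a hypothetical 10 Mbit/s link; none of these figures beyond the two payload sizes come from the release itself, and protocol overhead is ignored.

```python
def transfer_seconds(size_bytes: float, link_mbps: float) -> float:
    """Seconds to move size_bytes over a link of link_mbps megabits per second."""
    return size_bytes * 8 / (link_mbps * 1_000_000)


FULL_BUNDLE_BYTES = 10 * 10**9  # full LCM bundle, taken as exactly 10 GB
METADATA_BYTES = 50 * 10**3     # 50 KB installer metadata file
LINK_MBPS = 10                  # hypothetical edge-site link speed

size_ratio = FULL_BUNDLE_BYTES / METADATA_BYTES
full_time = transfer_seconds(FULL_BUNDLE_BYTES, LINK_MBPS)
metadata_time = transfer_seconds(METADATA_BYTES, LINK_MBPS)
```

Under these assumptions the full bundle is 200,000 times larger and would take about 8,000 seconds (a little over two hours) to transfer, while the metadata file would take a fraction of a second.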
- The Update sub-tab represents the existing cluster update workflow where you would upload the full LCM bundle to generate the update advisor report and initiate the cluster update operation.
- The Plan and Update sub-tab is the recommended path which incorporates the enhancements in VxRail 7.0.480. Here you can upload the latest LCM pre-checks file and the installer metadata file that you found and downloaded from the Dell Support website. Uploading the LCM pre-checks file is optional to create a new report because there may not always be an updated file to apply. However, you do need to upload an installer metadata file to generate a new report from here. Once uploaded, VxRail Manager will generate an update advisor report against that installer metadata file every 24 hours.
Figure 1. New look to the Local Updates tab
Easier record-keeping for compliance drift and update advisor reports
VxRail 7.0.480 adds new functionality to make the compliance drift reports exportable to outside the VxRail Manager UI while also introducing a history tab to access past update advisor reports.
Some of you use the contents of the compliance drift report to build out a larger infrastructure status report for information sharing across your organizations. Making the report exportable simplifies that report-building process. When exporting the report, there is an option to group the information by host if you prefer.
Note that the compliance check functionality has moved from the Compliance tab under the Updates page to a separate page, which you can navigate to by selecting Compliance from under the VxRail section.
Figure 2. Exporting the compliance drift report
The exit of the Compliance tab comes with the introduction of the History tab on the Updates page in VxRail 7.0.480. Because VxRail Manager automatically generates a new update advisor report every 24 hours and you have the option to generate one on-demand, the update advisor report is often overwritten. To avoid the need to constantly export them as a form of record-keeping, the new History tab stores the last 30 update advisor reports. The reports are listed in a table format where you can see which target version the report was run against and when it was run. To view the full report, you can click the icon in the left-hand column.
Figure 3. New History tab to store the last 30 update advisor reports
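Keeping only the most recent 30 reports is a bounded-history pattern. The sketch below illustrates it with a fixed-length queue; how VxRail Manager actually stores the reports is not described in this post, and the `AdvisorReport` class and the dates are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class AdvisorReport:
    target_version: str
    generated_at: datetime


# A deque with maxlen drops the oldest entry once the limit is reached.
history = deque(maxlen=30)

# Simulate 35 daily reports against the same target version.
start = datetime(2023, 10, 1)
for day in range(1, 36):
    history.append(AdvisorReport("7.0.480", start + timedelta(days=day)))

oldest_kept = history[0].generated_at
newest_kept = history[-1].generated_at
```

After 35 simulated runs, the five oldest reports have been dropped and the 30 most recent remain available.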
Addressing cluster update challenges for larger-sized clusters
For those of you who have larger-sized clusters, cluster updates pose challenges that may prevent you from upgrading more frequently. For example, the length of the maintenance window required to complete a full cluster update may not fit within your normal business operations such that any cluster update activity will impact service availability. As a result, cluster updates are kept to a minimum and nodes inevitably are not rebooted for long periods of time. While the cluster pre-update health check is an effective tool to determine cluster readiness for an upgrade, some issues may be lurking that a node reboot can uncover. That’s why some of you script your own node reboot sequence that acts as a test run for a cluster upgrade. The script reboots each node one at a time to ensure service levels of your workloads are maintained. If any nodes fail to reboot, you can investigate those nodes.
VxRail 7.0.480 introduces the node reboot sequence on VxRail Manager UI so that you do not have to manage your scripts anymore. The new feature includes cluster-level and node-level prechecks to ensure it is safe to perform this activity. If nodes fail to reboot, there is an option for you to retry the reboot or skip it. Making this activity easy may also encourage more customers to do this additional pre-check before upgrading their clusters.
Figure 4. Selecting nodes in a cluster to reboot in sequential order
Figure 5. Monitoring the node reboot sequence on the dashboard
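For readers who have not written such a script, the general shape of a one-node-at-a-time reboot with a retry-or-skip decision looks roughly like this. It is a simplified sketch, not the VxRail implementation: the `reboot` callback and node names are hypothetical, the retry count is automatic here whereas VxRail Manager offers retry or skip as a choice, and the cluster-level and node-level prechecks are omitted.

```python
from typing import Callable, Dict, List


def rolling_reboot(
    nodes: List[str],
    reboot: Callable[[str], bool],
    max_retries: int = 1,
) -> Dict[str, str]:
    """Reboot nodes one at a time; retry a failed node, then skip it and continue."""
    outcome: Dict[str, str] = {}
    for node in nodes:
        attempts = 0
        while True:
            attempts += 1
            if reboot(node):
                outcome[node] = "rebooted"
                break
            if attempts > max_retries:
                outcome[node] = "skipped"  # flag this node for investigation
                break
    return outcome


attempts_seen: Dict[str, int] = {}


def fake_reboot(node: str) -> bool:
    """Test double: node-2 fails once and then succeeds; node-3 never succeeds."""
    attempts_seen[node] = attempts_seen.get(node, 0) + 1
    if node == "node-2":
        return attempts_seen[node] > 1
    return node != "node-3"


report = rolling_reboot(["node-1", "node-2", "node-3"], fake_reboot)
```

In this run node-2 recovers on its retry, while node-3 is marked as skipped after two failed attempts so that the sequence can finish and the node can be investigated afterward.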
VxRail 7.0.480 also provides the capability to split your cluster update into multiple parts. Doing so allows you to separate your cluster upgrade into smaller maintenance windows and work around your business operation needs. Though this capability could reduce the impact of a cluster upgrade to your organization, VMware recommends that you complete the full upgrade within one week given that there are some Day 2 operations that are disabled while the cluster is partially upgraded. VxRail enables this capability only through the VxRail API. When a cluster is in a partially upgraded state, features in the Updates tab are disabled and a banner appears alerting you of the cluster state. Cluster expansion and node removal operations are also unavailable in this scenario.
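When planning a split update, it can help to work out the batches ahead of time. The helper below is a planning sketch only: it does not call the VxRail API (the request format is not covered in this post), and the function name, node names, and batch size are hypothetical.

```python
from typing import List


def split_into_windows(nodes: List[str], nodes_per_window: int) -> List[List[str]]:
    """Group nodes into consecutive batches, one batch per maintenance window."""
    if nodes_per_window < 1:
        raise ValueError("nodes_per_window must be at least 1")
    return [nodes[i:i + nodes_per_window] for i in range(0, len(nodes), nodes_per_window)]


cluster = [f"node-{n}" for n in range(1, 11)]  # hypothetical 10-node cluster
windows = split_into_windows(cluster, 4)
```

With four nodes per window, this 10-node example needs three maintenance windows, which you would then schedule so that the whole upgrade completes within the recommended one week.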
The new lifecycle management capabilities added to VxRail 7.0.480 are part of the continual evolution of the VxRail LCM experience. They also represent how we value your feedback on how to improve the product and our dedication to making your suggestions come to fruition. The LCM capabilities added to this software release will drive more effective cluster update planning, which will result in higher rates of cluster update success that will drive more efficiencies in your IT operations. Though this blog focuses on the improvements in lifecycle management, please refer to the release notes for VxRail 7.0.480 for a complete list of features and enhancements added to this release. For more information about VxRail in general, visit the Dell Technologies website.
Author: Daniel Chiu