VxRail management has expanded beyond the VxRail Manager plug-in for vCenter to accommodate different use cases. VxRail Manager is a plug-in on vCenter that provides a fully integrated experience for managing VxRail clusters on a familiar interface. REST APIs extend the VxRail LCM capabilities to cloud deployment solutions and to organizations that deploy and manage VxRail clusters at scale, where running batch scripts, configuration management tools (such as Ansible or Puppet), or custom automation for cluster operations is more efficient. Cloud-based multi-cluster management is a newer management option for global orchestration of all the customer’s clusters from a single web portal interface. While VxRail Manager provides the complete management capability set for VxRail clusters, managing through REST APIs and through cloud-based multi-cluster management each has its own benefits. Over time, the gaps in functionality will close to further enhance the value each brings for its respective use cases.
VxRail Manager features user-friendly workflows for automating VxRail deployment and configuration and for monitoring the health of individual systems in the entire cluster. It also incorporates functionality for hardware serviceability and system platform lifecycle management. For instance, it guides system administrators through adding new systems to an existing cluster, and it automatically detects new systems when they come online. VxRail Manager is also used to replace failed disk drives without disrupting availability, to generate and download diagnostic log bundles, and to apply VMware updates or software patches non-disruptively across VxRail nodes.
With the VxRail Manager plug-in for vCenter Server, all VxRail Manager features are integrated with and accessible from vCenter Server, so users can benefit from these capabilities on a familiar management interface. With the plug-in, vCenter Server can also manage the physical hardware of the VxRail cluster.
Figure 7. VxRail Manager plug-in for vCenter Server
In addition to SRS-specific support, the VxRail Support page on vCenter Server links to VxRail Community pages for Dell Knowledge Base articles, user forums for FAQ information, and VxRail best practices. The following figure is an example of the support view.
Figure 8. VxRail Manager Support tab
The VxRail Manager plug-in provides access to a digital market for finding and downloading qualified software packages such as VMware Horizon Cloud, Data Domain Virtual Edition, RecoverPoint for VM, and other software options for VxRail systems.
VxRail Manager drastically simplifies operations of the virtualized IT environment. VxRail APIs take this a step further by exposing VxRail Manager functionality through standard, easy-to-consume public APIs, which can be integrated into a broad spectrum of existing automation solutions. This applies not only to large enterprises and service providers, but also to midsize enterprises whose limited IT staff rely on scripts to automate IT processes and tasks.
The VxRail API can be used for the following use cases:
- Infrastructure as Code (IaC) environments that execute typical administrative tasks, such as monitoring, querying, reboot/shutdown, and lifecycle management updates, from configuration management tools like Puppet, Ansible, or Chef
- VMware administrators who use PowerCLI with a VxRail API Windows PowerShell module, which shortens the learning curve
- Customers who use batch scripts or custom automation with the REST APIs to manage clusters at scale
- Organizations that use VxRail as an essential building block for a fully automated VMware SDDC or hybrid cloud stack; VxRail can provide native, full-stack integration with the VMware Cloud Foundation platform
The REST APIs are easy to explore and consume: the latest API documentation is accessible from a web browser through the Swagger integration.
Figure 9. Connectivity of VxRail REST APIs
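As a rough illustration of the batch-script use case described above, the following Python sketch builds an authenticated REST request using only the standard library. The endpoint path `/rest/vxm/v1/system`, the use of HTTP Basic authentication, and the host name and credentials are assumptions made for this example; confirm the actual paths and authentication scheme in the Swagger documentation for your release. The function only constructs the request and does not send it.

```python
import base64
import urllib.request


def build_system_info_request(vxm_host: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for cluster system information."""
    # Assumed endpoint path; verify it against the Swagger documentation.
    url = f"https://{vxm_host}/rest/vxm/v1/system"
    # Assumed HTTP Basic authentication: base64("user:password").
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical host and credentials, for illustration only.
    req = build_system_info_request("vxm.example.local", "admin@vsphere.local", "secret")
    print(req.get_method(), req.full_url)
    # To send it, pass the request to urllib.request.urlopen(req, timeout=30)
    # and decode the JSON body of the response.
```

Keeping request construction separate from sending makes it straightforward to loop over many clusters in a script and to test the request logic without network access.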
Cloud-based multi-cluster management
As stated in the introductory section of this Tech Book, the drive for digital information requires technologies that will greatly reduce the reliance on IT personnel to manage infrastructure. VxRail lifecycle management is an example of VxRail technology that can reduce the time spent managing infrastructure. To further enhance operational efficiency, AI-driven operations and multi-cluster management are areas where VxRail can introduce more operational simplicity, cutting down the time needed to manage clusters at scale, and more operational intelligence, offloading some of the decision-making burden from IT personnel for LCM and for maintaining the health of the clusters.
VxRail HCI System Software cloud-based multi-cluster management is a centralized data collection and analytics platform that streamlines the monitoring and management of multiple VxRail clusters for a customer, improves serviceability, and helps the customer make better decisions to manage performance and capacity of their HCI. It is a cloud-based analytics platform that leverages advanced telemetry collected from the VxRail clusters for its infrastructure machine learning to provide reporting and actionable insight. Its infrastructure machine learning utilizes built-in knowledge of Dell best practices and more than 700 common issues. Cloud-based multi-cluster management provides health scores for the entire HCI stack to enable customers to quickly identify areas to troubleshoot and to address areas to efficiently scale based on projected growth of IT resources.
How does it work?
Cloud-based multi-cluster management is available with no additional hardware or software required for the VxRail cluster. It relies on a data collector service, provided by the VxRail HCI System Software running on the VxRail nodes, to aggregate metrics from the vSAN cluster as well as from the VxRail system. Known as the Adaptive Data Collector, this service frequently transfers the aggregated bundle of data to the VxRail cloud-based platform using the same SRS conduit that is used for dial-home services. Because it uses SRS, a support account with MyService360 is required, and SRS must be configured and enabled for data to be transferred to the VxRail data lake. This repository is housed at Dell. Using Pivotal Cloud Foundry as its cloud-based service platform, cloud-based multi-cluster management applies its infrastructure machine learning to produce reporting and insight that help customers improve serviceability and operational efficiency. Cloud-based multi-cluster management functionality is consumed entirely through a Dell-hosted web portal, called MyVxRail, which provides a single global view of the customer’s VxRail environment.
Figure 10. Cloud-based multi-cluster management connectivity
There are four settings for data collection frequencies: do not collect (NONE), once a day (BASIC), once every hour (MEDIUM, which is the default setting), or once every half hour (ADVANCED). Data collection frequency is configured in the telemetry settings either using REST API commands or the VxRail Manager plug-in. The timeliness of the content shown on MyVxRail is dependent on the frequency of the data collection a user configures for their clusters. Cloud-based multi-cluster management uses infrastructure machine learning to model and train data to create accurate predictions. The more data it can analyze, the better the models will be.
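The four collection levels and their intervals can be summarized in a small Python sketch. The level names and intervals come from the description above; the class, dictionary, and function names are hypothetical and are not part of any VxRail API.

```python
from datetime import timedelta
from enum import Enum


class CollectionLevel(Enum):
    """Data collection levels as named in the telemetry settings."""
    NONE = "NONE"
    BASIC = "BASIC"
    MEDIUM = "MEDIUM"
    ADVANCED = "ADVANCED"


# Interval between collections for each level; None means no collection.
COLLECTION_INTERVAL = {
    CollectionLevel.NONE: None,
    CollectionLevel.BASIC: timedelta(days=1),
    CollectionLevel.MEDIUM: timedelta(hours=1),
    CollectionLevel.ADVANCED: timedelta(minutes=30),
}

DEFAULT_LEVEL = CollectionLevel.MEDIUM


def collections_per_day(level: CollectionLevel) -> int:
    """Return how many data bundles a cluster sends per day at this level."""
    interval = COLLECTION_INTERVAL[level]
    if interval is None:
        return 0
    return timedelta(days=1) // interval


if __name__ == "__main__":
    for level in CollectionLevel:
        print(level.value, collections_per_day(level))
```

The interval also bounds how stale the content on MyVxRail can be: at the default MEDIUM level, the displayed data reflects a collection taken within roughly the last hour.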
Cloud-based multi-cluster management features
Cloud-based multi-cluster management is designed for continuous innovation and continuous delivery so that frequent, incremental updates can be made to introduce new capabilities. It currently provides the following capability sets:
- Cloud-based management portal – cloud-based multi-cluster management is accessed from a cloud-based web portal, called MyVxRail. This web portal provides a customer with a central point of management for all their VxRail clusters. All features of cloud-based multi-cluster management are made available through MyVxRail.
- Global visualization – cloud-based multi-cluster management provides a centralized topology of all VxRail clusters in one global view, instead of locally managing VxRail clusters per vCenter Server. There are two views. Clusters are organized logically by the vCenter Servers in the Logical View and physically according to the geographic location depicted on a global map in the Physical View. As the user navigates from the vCenter Server down to individual VxRail nodes, corresponding information about the selected object, its health, its resource (CPU, memory, capacity, network) usage, and underlying VM counts are shown to the user.
- Simplified health scores – Identify and assess impact of existing and potential health issues at the cluster and node levels so the user can quickly identify and troubleshoot problem areas to improve performance, availability, and IT resource planning. Infrastructure machine learning is used to learn behavior patterns of VxRail clusters and more accurately identify anomalies that may signal potential issues to address.
- Advanced metrics charting – With intelligent health reporting, the user can pinpoint problem areas using metrics charting of CPU, memory, capacity, and networking resources.
- Future capacity planning – Infrastructure machine learning is also used to project future usage so the user can have better insight into current usage and projected IT resource needs.
- Lifecycle management – cloud-based multi-cluster management provides LCM planning and execution capabilities across multiple clusters with a single workflow. Perform on-demand pre-update cluster health checks (LCM pre-checks for short) to determine whether the cluster is ready to start the cluster update process and orchestrate update bundle downloads onto VxRail clusters. Once staged on the VxRail Manager VM on the cluster, a user can initiate the execution of a cluster update.
- Role-based access control – Integration with vCenter access control allows customers to regulate access and privileges to perform lifecycle management operations. MyVxRail can register with the vCenter Servers so that privileges such as LCM pre-checks, update bundle download and staging, and cluster update can be managed using vCenter access control and enforced by MyVxRail.
- Credentials management – Credentials used to initiate a cluster update can be managed from MyVxRail to streamline cluster updates at scale. During initial setup of cluster update, credentials for vCenter Server, Platform Services Controller, and VxRail Manager are entered and saved locally on each cluster. When initiating a cluster update, MyVxRail can automatically provide the saved credentials to execute the update if the user is privileged through role-based access control. Management of the credentials can be further restricted to a smaller group of users using the ‘manage credentials’ privilege.
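To make the idea behind future capacity planning concrete, the following Python sketch fits a straight line through daily usage samples and estimates how many days remain before capacity is reached. This is only an illustration of trend projection: the source does not describe the model that cloud-based multi-cluster management uses, and a simple least-squares line is an assumption chosen for brevity. The function name and sample values are hypothetical.

```python
from typing import Optional, Sequence


def days_until_full(used_tb: Sequence[float], capacity_tb: float) -> Optional[float]:
    """Estimate days from the last sample until usage reaches capacity.

    used_tb holds one usage sample per day, oldest first. Returns None when
    the fitted trend is flat or decreasing, because no fill date can be projected.
    """
    n = len(used_tb)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    # Ordinary least-squares fit of usage against day index 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(used_tb) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_tb))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    # Day index at which the fitted line crosses the capacity limit.
    day_full = (capacity_tb - intercept) / slope
    return max(0.0, day_full - (n - 1))


if __name__ == "__main__":
    # Hypothetical samples: usage grows by 1 TB per day toward a 20 TB limit.
    print(days_until_full([10.0, 11.0, 12.0, 13.0], capacity_tb=20.0))
```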
The features in cloud-based multi-cluster management touch upon various areas of system management. This section goes over some of the major use cases that the features were designed to address.
- Global health monitoring – the combination of global visualization and simplified health scores provides a convenient and streamlined way to assess the health of the entire VxRail footprint, along with the ability to single out clusters in need of attention. Upon login, a user can see all their clusters in a logical or physical view and navigate through the topology. Health scores are integrated into the topology so that a user can see aggregate health scores. From a top-level view, a user can quickly spot poorly behaving sections of the topology and drill down to narrow their focus for troubleshooting.
Figure 11. MyVxRail Summary tab
- Troubleshooting – cloud-based multi-cluster management helps users troubleshoot issues detected from the simplified health scores. Component failures, configuration issues, and performance anomalies are reflected in the health score. Performance anomaly detection relies on predictive analytics that determine a normal behavioral pattern and flag occurrences when a VxRail node is behaving abnormally. A user can drill down into the list of issues that degraded the health score to understand the reason behind each issue. For some issues, Knowledge Base articles are provided to help troubleshoot the cause. For performance issues, the metrics charting function allows the user to pinpoint the time of the issue and analyze the networking, disk, memory, and capacity activity during that time.
Figure 12. MyVxRail Performance tab
- On-demand LCM pre-checks – While VxRail LCM simplifies much of the update process through automation, orchestration, and configuration stability, finding out that a cluster is not ready for an update during the scheduled update window can be troublesome. With LCM pre-checks, a user can run the pre-check at any time to learn whether a cluster is ready for an update. Issues can be discovered and addressed during the update planning phase rather than at the time of the update. This feature is also designed to incorporate the latest health checks so that the pre-check determines cluster update readiness as accurately as possible.
Figure 13. VxRail pre-check report
- Update bundle download and staging – Downloading VxRail update bundles across multiple VxRail clusters can be challenging. Some clusters may be individually managed because they are geographically dispersed. Some clusters may have network bandwidth issues. Cloud-based multi-cluster management provides the ability to orchestrate the downloads across many or all clusters in a single operation, which can offer significant time savings. This feature also can identify the delta of the current VxRail version and the target VxRail version so that only the required component installation files are packaged in the download versus downloading the entire update bundle. Bandwidth-strapped clusters can realize tremendous time savings, especially in cases where minor updates may require only a few component updates.
- Cluster update – Combined with the LCM pre-checks and update bundle download and staging, MyVxRail can provide LCM of clusters at scale. Customers can perform planning operations to gauge readiness before staging the update bundle and scheduling the maintenance window. When the time comes, customers can initiate the cluster update for multiple clusters in a single workflow. Customers can customize the update path for each cluster. A time estimate, based on telemetry data gathered about the VxRail install base, is provided for each update path. A credentials manager further streamlines cluster update at scale by automating the input of infrastructure credentials needed to execute the operation. Cluster update requires a fee-based add-on license, SaaS active multi-cluster management for HCI System Software, that is applied to each node in the cluster.
Figure 14. MyVxRail Updates tab
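The delta-based download described under update bundle download and staging can be sketched as a comparison of two component manifests. The manifest structure (a mapping from component name to version string), the component names, and the version values below are hypothetical; the source does not describe how the delta is actually computed.

```python
from typing import Dict


def components_to_download(current: Dict[str, str], target: Dict[str, str]) -> Dict[str, str]:
    """Return the target components whose version differs from, or is absent in, the current manifest."""
    return {
        name: version
        for name, version in target.items()
        if current.get(name) != version
    }


if __name__ == "__main__":
    # Hypothetical manifests for illustration only.
    current = {"hypervisor": "7.0.1", "manager": "7.0.100", "bios": "2.8"}
    target = {"hypervisor": "7.0.2", "manager": "7.0.130", "bios": "2.8"}
    # Only the hypervisor and manager entries differ, so the unchanged
    # bios package would be left out of the download.
    print(components_to_download(current, target))
```

For a minor update in which most components are unchanged, the resulting set is small, which is where bandwidth-constrained clusters would see the largest savings.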
The cloud-based multi-cluster management capabilities are also available in CloudIQ. Adding VxRail visibility to CloudIQ allows users to view and monitor all their Dell infrastructure from a single web portal. In CloudIQ, users can benefit from global visualization and single-system views of their VxRail clusters from the System Health, Inventory, Capacity, and Performance sections. Simplified health scores, capacity forecasting, and performance graphs of VxRail clusters are available in CloudIQ.
A virtualization view, as shown in the following figure, organizes the VxRail cluster information, similar to the vCenter Server experience, for easier navigation.
Figure 15. CloudIQ virtualization view
CloudIQ users can initiate the same lifecycle management operations that are available in MyVxRail. Credentials and role-based access using the vCenter have also been added to CloudIQ to facilitate streamlined multi-cluster updates and ensure appropriate use of lifecycle management operations. CloudIQ uses the same add-on license that is used to entitle system updates for VxRail clusters on MyVxRail.
Figure 16. CloudIQ System Updates tab