Enhancing Satellite Node Management at Scale
Tue, 15 Mar 2022 20:04:50 -0000
Satellite nodes are a great addition to the VxRail portfolio, empowering users at the edge, as described in David Glynn’s blog “Satellite Nodes: Because sometimes even a 2-node cluster is too much.” Although satellite nodes are still new, we’ve been working hard and have already started making improvements. Dell’s latest VxRail 7.0.350 release has a number of enhancements, and in this blog we’ll focus on these new satellite node features:
- Improved life cycle management (LCM)
- New APIs
- Improved security
The first way we’ve improved satellite nodes is by reducing the required maintenance window. To do this, the satellite node update process has now been split in two. Instead of staging the recovery bundle and performing the update in one step, you can now stage the recovery bundle and perform the update separately.
Staging the bundle in advance helps because bandwidth can be limited at the edge. Staging allows ample time to transfer the bundle ahead of time, so the update itself can happen during your scheduled maintenance window. Once your bundles are staged, it’s as simple as scheduling the updates and letting VxRail execute the node update. This improvement helps you complete the update within the expected timeframe and minimize downtime, which matters because satellite nodes sit outside the cluster and, as a result, workloads go offline while the node is updated.
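To get a feel for why staging ahead of time matters, here is a rough back-of-the-envelope sketch in Python. The bundle size, link speed, and the 0.8 efficiency factor are hypothetical values chosen for illustration, not VxRail figures, and the function name is our own.

```python
def transfer_hours(bundle_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to copy a bundle over a bandwidth-limited link.

    bundle_gb  -- bundle size in decimal gigabytes (hypothetical input)
    link_mbps  -- nominal link speed in megabits per second
    efficiency -- assumed fraction of the link that is actually usable
    """
    if bundle_gb <= 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("sizes and speeds must be positive, efficiency in (0, 1]")
    bits = bundle_gb * 8 * 1000**3                    # gigabytes -> bits
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    return bits / usable_bits_per_second / 3600       # seconds -> hours


if __name__ == "__main__":
    # Hypothetical example: a 4 GB bundle over a 10 Mbps edge link.
    print(f"{transfer_hours(4, 10):.2f} hours")
```

If the estimate is longer than your maintenance window, that is a good sign the bundle should be staged beforehand rather than transferred as part of the update.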
Do you have a large number of edge locations that could use satellite nodes and need an easier way to manage at scale? Good news! These new APIs are perfect for making edge life at scale easier.
The new APIs include:
- Satellite node LCM
- Add a satellite node to a managed folder
- Remove a satellite node from a managed folder
The introductory release of VxRail satellite nodes featured LCM operations through the VxRail Manager plug-in, which could be quite time consuming if you are managing a large number of satellite nodes. We saw room for improvement so now administrators can use VxRail APIs to add, update, and remove satellite nodes to simplify and speed up operations.
You can use the satellite node LCM API to adjust configuration settings that benefit management at scale, such as adjusting the number of satellite nodes you want to update in parallel. For example, although the default is to update 20 nodes in parallel, you can initiate updates for up to 30 satellite nodes in parallel, as needed.
There is also a failure rate feature that will set a condition to exit from an LCM operation. For example, if you are updating multiple satellite nodes at one time and nodes are failing to update, the failure rate setting is a way to abort the operation altogether if the rate surpasses a set threshold. The default threshold is 20% but can be set anywhere from 1% to 100%. Using the VxRail API, you can adjust settings like this that are not available in the VxRail Manager.
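The interplay between the parallel count and the failure rate can be sketched with a small simulation. This is not the VxRail API: the function and parameter names are hypothetical, the lower bound of 1 for the parallel count is an assumption, and we assume the failure rate is calculated as failed nodes divided by all nodes in the request, which the product may do differently.

```python
from typing import Callable, Dict, List, Sequence


def run_batched_update(
    nodes: Sequence[str],
    update_node: Callable[[str], bool],
    parallel: int = 20,
    failure_rate_pct: float = 20.0,
) -> Dict[str, object]:
    """Update nodes in batches and abort once too many of them have failed."""
    if not 1 <= parallel <= 30:
        raise ValueError("parallel must be between 1 and 30")
    if not 1 <= failure_rate_pct <= 100:
        raise ValueError("failure_rate_pct must be between 1 and 100")

    succeeded: List[str] = []
    failed: List[str] = []
    for start in range(0, len(nodes), parallel):
        batch = nodes[start:start + parallel]
        # A real run would update the batch concurrently; a loop keeps the sketch simple.
        for node in batch:
            (succeeded if update_node(node) else failed).append(node)
        # Assumption: rate = failed nodes / all nodes in the request.
        if 100 * len(failed) / len(nodes) > failure_rate_pct:
            return {
                "aborted": True,
                "succeeded": succeeded,
                "failed": failed,
                "not_attempted": list(nodes[start + parallel:]),
            }
    return {"aborted": False, "succeeded": succeeded, "failed": failed, "not_attempted": []}
```

In this model, a lower threshold stops a bad rollout sooner and leaves more nodes untouched, while a higher threshold lets the operation push through isolated failures.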
These new APIs are great for users with a large number of VxRail satellite nodes. Adding, removing, and updating satellite nodes can now be automated through the new APIs, saving you precious time across your edge locations.
VxRail satellite nodes can now use Secure Enterprise Key Management (SEKM), made available through the Dell PowerEdge servers that VxRail is built on. What is SEKM you might ask? Well, SEKM gives you the ability to secure drive access using encryption keys stored on a central key management server (not on the satellite node).
SEKM is great for many reasons. First, an edge location might be more exposed and have less physical security than your typical data center but that doesn’t mean securing your data is any less important. SEKM keeps your data drives locked even if the entire server is stolen. When paired with self-encrypting drives, you can secure the data even further. Second, the encryption keys are stored in a centralized location, making it easier to manage the security of large numbers of satellite nodes instead of having to manage each satellite node individually.
In this blog we’ve highlighted some exciting new satellite node features, including an improved update process, new APIs, and enhanced security, all of which enhance managing the edge at scale. Check out the full VxRail 7.0.350 release and see the full list of enhancements by clicking the link below.
Thanks for reading!
Author: Stephen Graham, VxRail Tech Marketing
Related Blog Posts
Learn About the Latest Major VxRail Software Release: VxRail 7.0.400
Wed, 21 Sep 2022 13:04:04 -0000
As many parts of the world welcome the fall season and the cooler temperatures that it brings, one area that has not cooled down is VxRail. The latest VxRail software release, 7.0.400, introduces a slew of new features that will surely fire up our VxRail customers and spur them to schedule their next cluster update.
VxRail 7.0.400 provides support for VMware ESXi 7.0 Update 3g and VMware vCenter Server 7.0 Update 3g. All existing platforms that support VxRail 7.0 can upgrade to VxRail 7.0.400. Upgrades from VxRail 4.5 and 4.7 are supported, which is an important consideration because standard support from Dell for those versions ends on September 30.
VxRail 7.0.400 software introduces features in the following areas:
- Life cycle management
- Dynamic nodes
- Configuration flexibility
This blog delves into major enhancements in those areas. For a more comprehensive rundown of the features added to this release, see the release notes.
Life cycle management
Because life cycle management is a key area of value differentiation for our VxRail customers, the VxRail team is continuously looking for ways to further enhance the life cycle management experience. One aspect that has come into recent focus is handling cluster update failures caused by VxRail nodes failing to enter maintenance mode.
During a cluster update, nodes are put into maintenance mode one at a time. Their workloads are moved onto the remaining nodes in the cluster to maintain availability while the nodes go through software, firmware, and driver updates. VxRail 7.0.350 introduced capabilities to notify users of situations such as host pinning and mounted VM tools on the host that can cause nodes to fail to enter maintenance mode, so users can address those situations before initiating a cluster update.
VxRail 7.0.400 addresses this cluster update failure scenario even further by being smarter with how it handles this issue once the cluster update is in operation. If a node fails to enter maintenance mode, VxRail automatically skips that node and moves onto the next node. Previously, this scenario would cause the cluster update operation to fail. Now, users can run that cluster update and process as many nodes as possible. Users can then run a cluster update retry, which targets only the nodes that were skipped. The combination of skipping nodes and targeted retry of those skipped nodes significantly improves the cluster update experience.
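The skip-and-retry behavior can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not VxRail code: `rolling_update`, `can_enter_maintenance`, and `apply_update` are hypothetical names standing in for whatever the product does internally.

```python
from typing import Callable, Iterable, List, Tuple


def rolling_update(
    nodes: Iterable[str],
    can_enter_maintenance: Callable[[str], bool],
    apply_update: Callable[[str], None],
) -> Tuple[List[str], List[str]]:
    """Update nodes one at a time, skipping any that cannot enter maintenance mode."""
    updated: List[str] = []
    skipped: List[str] = []
    for node in nodes:
        if not can_enter_maintenance(node):
            skipped.append(node)   # move on instead of failing the whole run
            continue
        apply_update(node)
        updated.append(node)
    return updated, skipped


if __name__ == "__main__":
    blocked = {"node-2"}                        # for example, a pinned VM on this host
    cluster = ["node-1", "node-2", "node-3", "node-4"]
    updated, skipped = rolling_update(cluster, lambda n: n not in blocked, lambda n: None)
    print("first pass:", updated, "skipped:", skipped)

    blocked.clear()                             # the administrator resolves the blocker
    retried, still_skipped = rolling_update(skipped, lambda n: n not in blocked, lambda n: None)
    print("retry:", retried, "still skipped:", still_skipped)
```

The retry pass receives only the skipped list, which mirrors the targeted retry described above: nodes that already updated are not touched again.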
Figure 1: Addressing nodes failing to enter maintenance mode
In VxRail 7.0.400, a Dell RecoverPoint for VMs compatibility check has been added to the update advisory report, cluster update pre-check, and cluster update operation to inform users of a potential incompatibility scenario. Having data protection in an unsupported state puts an environment at risk. The addition of the compatibility check is great news for RecoverPoint for VMs users because this previously manual task is now automated, helping to reduce risk and streamline operations.
VxRail dynamic nodes
Since the introduction of VxRail dynamic nodes last year, we’ve incrementally added more storage protocol support for increased flexibility. NFS, CIFS, and iSCSI support were added earlier this year. In VxRail 7.0.400, users can configure their VxRail dynamic nodes with storage from Dell PowerStore using NVMe over Fabrics over TCP (NVMe-oF/TCP). NVMe provides much faster data access compared to SATA and SAS. The support requires Dell PowerStoreOS 2.1 or later and Dell PowerSwitch with the virtual Dell SmartFabric Storage Service appliance.
VxRail cluster deployment using NVMe-oF/TCP is not much different from setting up iSCSI storage as the primary datastore for VxRail dynamic node clusters. The cluster must go through the Day 1 bring-up activities to establish IP connectivity. From there, the user can then set up the port group, VM kernels, and NVMe-oF/TCP adapter to access the storage shared from the PowerStore.
Setting up NVMe-oF/TCP between the VxRail dynamic node cluster and PowerStore is separate from the cluster deployment activities. You can find more information about deploying NVMe-oF/TCP here: https://infohub.delltechnologies.com/t/smartfabric-storage-software-deployment-guide/.
VxRail 7.0.400 also adds VMware Virtual Volumes (vVols) support for VxRail dynamic nodes. Cluster deployment with vVols over Fibre Channel follows a workflow similar to cluster deployment with a VMFS datastore. Provisioning and zoning of the Virtual Volume need to be done before the Day 1 bring-up. The VxRail Manager VM is installed onto the datastore as part of the Day 1 bring-up.
For vVols over IP, the Day 1 bring-up needs to be completed first to establish IP connectivity. Then the Virtual Volume can be mounted and a datastore can be created from it for the VxRail Manager VM.
Figure 2: Workflow to set up VxRail dynamic node clusters with VMware Virtual Volumes
VxRail 7.0.400 introduces the option for customers to deploy a local VxRail managed vCenter Server with their VxRail dynamic node cluster. The Day 1 bring-up installs a vCenter Server onto the cluster with a 60-day evaluation license, but the customer is required to purchase their own vCenter Server license. VxRail customers are accustomed to having a Standard edition vCenter Server license packaged with their VxRail purchase. However, that vCenter Server license is bundled with the VMware vSAN license, not the VMware vSphere license, so it does not come with dynamic nodes, which do not use vSAN.
VxRail 7.0.400 supports the use of Dell PowerPath/VE with VxRail dynamic nodes, which is important to many storage customers who have been relying on PowerPath software for multipathing capabilities. With VxRail 7.0.400, VxRail dynamic nodes can use PowerPath with PowerStore, PowerMax, or Unity XT storage arrays over the NFS, iSCSI, or NVMe over Fibre Channel storage protocols.
Another topic that continues to burn bright, no matter the season, is security. As threats continue to evolve, it’s important to continue to advance security measures for the infrastructure. VxRail 7.0.400 introduces capabilities that make it even easier for customers to further protect their clusters.
While the security configuration rules set forth by the Security Technical Implementation Guide (STIG) are required for customers working in or with the U.S. federal government and Department of Defense, other customers can benefit from hardening their own clusters. VxRail 7.0.400 automatically applies a subset of the STIG rules on all VxRail clusters. These rules protect VM controls and the underlying SUSE Linux operating system controls. Application of the rules occurs without any user intervention upon an upgrade to VxRail 7.0.400 and at the cluster deployment with this software version, providing a seamless experience. This feature increases the security baseline for all VxRail clusters starting with VxRail 7.0.400.
Digital certificates are used to verify the external communication between trusted entities. VxRail customers have two options for digital certificates. Self-signed certificates use VxRail as the certificate authority to sign the certificate. Customers use this option if they don’t need a Certificate Authority or choose not to pay for the service. Otherwise, customers can import a certificate signed by a Certificate Authority to the VxRail Manager. Both options require certificates to be shared between the VxRail Manager and vCenter Server for secure communication to manage the cluster.
Previously, both options required manual intervention, at varying levels, to manage certificate renewals and ensure uninterrupted communication between the VxRail Manager and the vCenter Server. Loss of communication can affect cluster management operations, though not the application workloads.
Figure 3: Workflow for managing certificates
With VxRail 7.0.400, all areas of managing certificates have been simplified to make it easier and safer to import and manage certificates over time. Now, VxRail certificates can be imported via the VxRail Manager and API. There’s an API to import the vCenter certificate into the VxRail trust store. Renewals can be managed automatically via the VxRail Manager so that customers do not need to constantly check expiring certificates and replace certificates. Alternatively, new API calls have been created to perform these activities. While these features simplify the experience for customers already using certificates, hopefully the simplified certificate management will encourage more customers to use it to further secure their environment.
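To show the kind of check that automated renewal spares you from doing by hand, here is a small Python sketch that decides whether a certificate is close enough to expiry to renew. The 30-day window and the function name are assumptions made for illustration; they are not documented VxRail settings.

```python
from datetime import datetime, timedelta, timezone


def needs_renewal(not_after: datetime, now: datetime, window_days: int = 30) -> bool:
    """Return True when the certificate has expired or expires within the window."""
    if window_days < 0:
        raise ValueError("window_days must not be negative")
    return not_after - now <= timedelta(days=window_days)


if __name__ == "__main__":
    now = datetime(2022, 9, 21, tzinfo=timezone.utc)
    expiring_soon = now + timedelta(days=10)
    valid_for_a_while = now + timedelta(days=90)
    print(needs_renewal(expiring_soon, now))      # inside the 30-day window
    print(needs_renewal(valid_for_a_while, now))  # outside the window
```

Running a check like this on a schedule for every certificate is exactly the sort of repetitive task that is easy to forget, which is why having it managed for you is safer.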
VxRail 7.0.400 also introduces end-to-end upgrade bundle integrity check. This feature has been added to the pre-update health check and the cluster update operation. The signing certificate is verified to ensure the validity of the root certificate authority. The digital certificate is verified. The bundle manifest is also checked to ensure that the contents in the bundle have not been altered.
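The manifest part of such a check can be sketched in a few lines of Python. This is a generic illustration, not the VxRail implementation: the manifest format (a mapping of file name to SHA-256 digest) and the function name are assumptions, and verification of the signing certificate is left out.

```python
import hashlib
from typing import Dict, List


def verify_manifest(manifest: Dict[str, str], files: Dict[str, bytes]) -> List[str]:
    """Return the names of files that are missing, altered, or not listed in the manifest."""
    problems: List[str] = []
    for name, expected_digest in manifest.items():
        data = files.get(name)
        if data is None or hashlib.sha256(data).hexdigest() != expected_digest:
            problems.append(name)
    # Files that appear in the bundle but not in the manifest are also suspicious.
    problems.extend(sorted(set(files) - set(manifest)))
    return problems


if __name__ == "__main__":
    bundle = {"esxi.zip": b"hypervisor bits", "firmware.bin": b"firmware bits"}
    manifest = {name: hashlib.sha256(data).hexdigest() for name, data in bundle.items()}
    print(verify_manifest(manifest, bundle))        # untouched bundle

    bundle["firmware.bin"] = b"tampered bits"
    print(verify_manifest(manifest, bundle))        # altered file is reported
```

A manifest check like this only proves the contents match the manifest, which is why it is paired with certificate verification: the signature establishes that the manifest itself can be trusted.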
With any major VxRail software release comes enhancements in configuration flexibility. VxRail 7.0.400 provides more flexibility for base networking and more flexibility in using and managing satellite nodes.
Previous VxRail software releases introduced long-awaited support for dynamic link aggregation for vSAN and vSphere vMotion traffic and support for two vSphere Distributed Switches (VDS) to separate management traffic from vSAN and vMotion traffic. VxRail 7.0.400 removes the previous port count restriction of four ports for base networking. Customers can now also deploy clusters with six or eight ports for base networking while employing link aggregation or multiple VDS, or both.
Figure 4: Two VDS with six NIC ports
Figure 5: Two VDS with eight NIC ports with link redundancy for vMotion traffic and link aggregation for vSAN traffic
With VxRail 7.0.400, customers can convert their vSphere Standard Switch on their satellite nodes to a customer-managed VDS after deployment. This support allows customers to more easily manage their VDS and satellite nodes at scale.
The most noteworthy serviceability enhancement I want to mention is the ability to create service tickets from the VxRail Manager UI. This functionality makes it easier for customers to submit service tickets, which can speed resolution time and improve the feedback loop for providing product improvement suggestions. This feature requires an active connection with the Embedded Service Enabler to Dell Support Services. Customers can submit up to five attachments to support a service ticket.
Figure 6: Input form to create a service request
VxRail 7.0.400 is no doubt one of the more feature-heavy VxRail software releases in some time. Customers big and small will find value in the capability set. This software release enhances existing features while also introducing new tools that further focus on VxRail operational simplicity. While this blog covers the highlights of this release, I recommend that you review the release notes to further understand all the capabilities in VxRail 7.0.400.
Protecting VxRail From Unplanned Power Outages: More Choices Available
Tue, 31 May 2022 12:36:51 -0000
In my previous blog, Protecting VxRail from Power Disturbances, I described the first API-integrated solution that helps customers preserve data integrity on VxRail if there are unplanned power events. Today, I'm excited to introduce another solution that resulted from our close partnership with Schneider Electric (APC).
Why is it important?
Over the last few years, VxRail has become a critical HCI system and data-center building block for over 15,000 customers who have deployed more than 220,000 nodes globally. When HCI was first introduced, it was often considered for specific workloads such as VDI or ROBO locations. However, with the evolution of hardware and software capabilities, VxRail became a catalyst in data-center modernization, deployed across various use cases from core to cloud to edge. Today, customers are deploying VxRail for mission-critical workloads because it is powerful enough to meet the most demanding requirements for performance, capacity, availability, and rich data services.
Dell Technologies is a leader in data-protection solutions and offers a portfolio of products that can fulfill even the most demanding RPO and RTO requirements from customers. In addition to using traditional data-protection solutions, it is best practice to use a UPS to protect the infrastructure and ensure data integrity if there are unplanned power events. In this blog, I want to highlight a new solution from Schneider Electric, the provider of APC Smart-UPS systems.
The APC UPS protection solution for VxRail
Schneider Electric is one of Dell Technologies’ strategic partners in the Extended Technologies Complete Program. It provides Dell Technologies with APC UPS and IT rack enclosures offering a comprehensive solution set of infrastructure hardware, monitoring, management software, and service options.
PowerChute Network Shutdown version 4.5 seamlessly integrates with VxRail by communicating over the network with the APC UPS. If there is a power outage, PowerChute can gracefully shut down VxRail clusters using the VxRail API. As a result of this integration, PowerChute can run on the same protected VxRail cluster, saving space and reducing hardware costs.
The solution requires the following:
- VxRail cluster with VxRail HCI System Software version 7.0.320, 4.7.540 or higher
- Dell Smart-UPS Online 5kVA DLRT5KRMXLT or Dell Smart-UPS Online 3kVA DLRT3000RMXLA
- UPS Network Management Card 3 (AP9640, AP9641, or AP9643) with NMC firmware version v2.2 or higher
- Either a 1-Year or 3-Year PowerChute license for each VxRail node in the cluster (PowerChute Network Shutdown software version 4.5 or higher)
Key benefits of this solution include:
- Unattended, graceful shutdown of virtual machines (VMs), followed by shutdown of the VxRail cluster itself, which avoids data corruption thanks to integration with the VxRail API.
- Minimal downtime after critical events have passed with a pre-configured automated start-up sequence, which is useful at remote or unattended sites.
- Full deployment within the VxRail cluster saves space and reduces hardware requirements since you don't have to deploy PowerChute on a separate machine outside the cluster.
- Edge-ready, with support for vSAN 2-node clusters.
- Redundant VxRail API-based cluster shutdown. In a redundant UPS set-up, if one NMC3 is offline, PowerChute will connect to one or more available NMC3s to carry out the VxRail cluster shutdown.
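The ordering and redundancy described in the benefits above can be modeled with a short Python sketch. It is a toy model, not PowerChute code: the function name, the `is_online` callback, and the step strings are all hypothetical, and the real product talks to the UPS and to the VxRail API rather than returning a list.

```python
from typing import Callable, List, Sequence, Tuple


def plan_shutdown(
    nmcs: Sequence[str],
    is_online: Callable[[str], bool],
    vms: Sequence[str],
) -> Tuple[List[str], List[str]]:
    """Pick the reachable NMC3 cards and list shutdown steps in order."""
    available = [nmc for nmc in nmcs if is_online(nmc)]
    if not available:
        raise RuntimeError("no reachable NMC3")
    # VMs go down first, and the cluster itself is shut down last.
    steps = [f"shut down VM {vm}" for vm in vms]
    steps.append("shut down VxRail cluster via the VxRail API")
    return available, steps


if __name__ == "__main__":
    offline = {"nmc-a"}                          # one card in a redundant pair is down
    available, steps = plan_shutdown(
        ["nmc-a", "nmc-b"], lambda n: n not in offline, ["app-vm", "db-vm"]
    )
    print("using:", available)
    for step in steps:
        print(step)
```

The point of the redundant set-up is visible in the example: losing one card does not block the shutdown, because the plan proceeds with whichever cards remain reachable.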
How does it work?
This is easiest to describe using the following diagram, which covers the steps taken in a power event and when the event is cleared:
How PowerChute Network Shutdown works with VxRail
I highly recommend watching the demo of this solution in action, which is listed in the Additional resources section at the end of this blog.
Protection against unplanned power events should be a part of a business continuity strategy for all customers who run their critical workloads on VxRail. This practice ensures data integrity by enabling automated and graceful shutdown of VxRail clusters. Customers now have more choice in providing such protection, with the new version of PowerChute Network Shutdown software for APC UPS systems integrated with VxRail API and validated with VxRail.
Solution brochure: PowerChute Network Shutdown v4.5 Brochure
Solution demo video: PowerChute Network Shutdown v4.5 VxRail Technical Demo
Previous blog: Protecting VxRail from Power Disturbances
Karol Boguniewicz, Senior Principal Engineering Technologist, Dell Technologies
LinkedIn: Karol Boguniewicz