VxRail and SmartDPUs—A Transformative Data Center Pairing
Tue, 17 Jan 2023 21:15:58 -0000
What VMware announced as Project Monterey back in September 2020 has finally come to fruition as the vSphere Distributed Services Engine. While that name may seem like a mouthful, it really does a great job of describing what the product does in just a few words.
vSphere Distributed Services Engine provides a consistent management and services platform to deliver dynamic hardware for the growing world of agile distributed applications. The vSphere Distributed Services Engine will, in the future, be the engine upon which non-hypervisor vSphere services run. Today it begins with NSX, but VMware has set its sights on moving vSAN storage and host management services to the vSphere Distributed Services Engine, thus freeing up x86 resources for virtual machines, containers, and the applications they support.
Powering vSphere Distributed Services Engine is a new type of PCIe card known as a data processing unit (DPU), currently available from NVIDIA and AMD. At Dell, we are calling them SmartDPUs, as these PCIe cards and the software they run are the cornerstone of tomorrow’s disaggregated cloud-native data center.
From a hardware perspective, it would be easy to assume that a SmartDPU is just a fancy network card; after all, the most distinguishing external features are the SFP ports. But hiding under the large heatsink is a small, powerful server with its own processor, memory, and storage. Most important is the programmable hardware I/O accelerator, the core of the SmartDPU that will deliver performance. The PowerEdge server team at Dell has gone a step further. They’ve tightly coupled the SmartDPUs with the existing PowerEdge iDRAC through the serial port and side-band server communication connections, bypassing the RJ45 management port. This allows the iDRAC to manage not only the PowerEdge server, but also the SmartDPUs. As awesome as the hardware is, it needs software for its power to be unleashed.
This is where vSphere Distributed Services Engine comes into play. In this initial release, VMware is moving NSX and the networking and security services that it provides to the vSphere environment from the main x86 CPU and onto the SmartDPU. This provides several benefits: The most obvious is that this will free up x86 CPU resources for virtual machine and container workloads. Less obvious is the creation of an air gap between the NSX services and ESXi, enabling zero trust security. Does this mean that SmartDPUs are just an NSX offload card? Yes and no. VMware and the industry are taking the first small steps in what will be a leap forward for data center infrastructure and design. Future steps by VMware will expand the vSphere Distributed Services Engine to have storage and host management services running on the SmartDPUs, thus leaving the x86 CPU to run host virtual machines and containers.
VMware’s journey does not stop there, and these steps may seem blasphemous at first, but VMware will provide bare metal support, enabling Linux or Windows to be deployed on the x86 hardware. VMware acknowledges that not every workload is suited to run on vSphere, but that these workloads would benefit from the security, networking, and storage services provided by the vSphere Distributed Services Engine—transforming the data center, breaking down silo walls, distributing and aggregating any and all workloads.
Where does VxRail fit in all this? In the same place as we always have: standing on the shoulders of the PowerEdge and VMware giants, looking to remove complexity and friction from technology, and making it easier and simpler for you to purchase, deploy, manage, update, and most importantly use this transformative technology. That frees up your cycles to refactor your applications to meet the ever-growing needs of your business. VxRail will be supporting the vSphere Distributed Services Engine with the AMD Pensando and NVIDIA BlueField-2 SmartDPUs on our core platforms: the E660F, V670F, and P670N. These nodes will be available in configurations for both VxRail with vSAN original storage architecture and VxRail dynamic nodes.
The journey of the modern data center is complex and ever changing, but with VxRail at your side you are in good company.
Author: David Glynn, Sr. Principal Engineer, VxRail Technical Marketing
Related Blog Posts
Update to VxRail 7.0.100 and Unleash the Performance Within It
Thu, 05 Nov 2020 23:07:52 -0000
What could be better than faster storage? How about faster storage, more capacity, and better durability?
Last week at Dell Technologies we released VxRail 7.0.100. This release brings support for the latest versions of VMware vSphere and vSAN 7.0 Update 1. Typically, in an update release we will see a new feature or two, but VMware outdid themselves and crammed not only a load of new or significantly enhanced features into this update, but also some game-changing performance enhancements. As my peers at VMware already did a fantastic job of explaining these features, I won’t even attempt to replicate their work; you can find links to the blogs on features that caught my attention in the reference section below. Rather, I want to draw attention to the performance gains, and ask the question: Could RAID5 with compression only be the new normal?
Don’t worry, I can already hear the cries of “Max performance needs RAID1, RAID5 has IO amplification and parity overhead, data reduction services have drawbacks”, but bear with me a little. Also, I’m not suggesting that RAID5 with compression only be used for all workloads; some workloads are definitely unsuitable, and streams of compressed video come to mind. Rather, I’m merely suggesting that after our customers go through the painless process of updating their cluster to VxRail 7.0.100 from one of our 36 previous releases over the past two years (yes, you can leap straight from 4.5.211 to 7.0.100 in a single update, and yes, we do handle the converging and decommissioning of the Platform Services Controller), they check out the reduction in storage IO latency for their existing workload on their VxRail cluster and investigate what it represents: in short, more storage performance headroom.
As customers buy VxRail clusters to support production workloads, they can’t exactly load them up with a variety of benchmark workload tests to see how far they can push them. But at VxRail we are fortunate to have our own dedicated performance team, who have enough VxRail nodes to run a mid-sized enterprise, and access to a large library of components so that they can replicate almost any VxRail configuration we sell (and a few we don’t). So, there is data behind my outrageous suggestion; it isn’t just back-of-the-napkin mathematics. Grab a copy of the performance team’s recent findings in their whitepaper: Harnessing the performance of Dell EMC VxRail 7.0.100: A lab based performance analysis of VxRail, and skip to figure 3. There you’ll find some very telling before and after performance latency curves with and without data reduction services for an RDBMS workload. Spoiler: 58% more peak IOPS and almost 40% lower latency; with compression, this only drops to a still very significant 45% more peak IOPS with 30% lower latency. (For those of you screaming “but failure domains”, check out the blog Space Efficiency Using the New “Compression only” Option, where Pete Koehler explains the issue, and how it no longer exists with compression only.) But what about RAID5? Skip up to figure 1, which summarizes the across-the-board performance gains for IOPS and throughput. Impressive, right? Now slide down to figure 2 to compare the throughput, in particular RAID 1 on 7.0 with RAID 5 on 7.0 U1: the read performance is almost identical, while the gap in write performance has narrowed. Write performance on RAID5 will likely always lag RAID1 due to IO amplification, but VMware is clearly looking to narrow that gap as much as possible.
If nothing else, the whitepaper should tell you that a simple, hassle-free upgrade to VxRail 7.0.100 will unlock additional performance headroom on your vSAN cluster without any additional costs, and that the tradeoffs associated with RAID5 and data reduction services (compression only) are greatly reduced. There are opportunistic space savings to be had from compression only, but they require committing to a cluster-wide change to unlock, which is something that should not be taken lightly. However, the guaranteed 33% capacity savings of RAID5 can be unlocked per virtual machine and reverted just as easily, which represents a lower risk. I opened by asking whether RAID5 with compression only could be the new normal, and I firmly believe that the data indicates that this is a viable option for many more workloads.
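For readers wondering where the 33% figure comes from, here is a minimal sketch of the arithmetic. It assumes a RAID1 mirror protecting against one failure (two full copies) and a RAID5 layout of three data components plus one parity component; the post itself only states the 33% result, so treat the layouts and the helper name `raw_per_usable` as illustrative assumptions.

```python
from fractions import Fraction


def raw_per_usable(data_components: int, redundancy_components: int) -> Fraction:
    """Raw capacity consumed per unit of usable capacity (hypothetical helper)."""
    return Fraction(data_components + redundancy_components, data_components)


# Assumption: RAID1 tolerating one failure keeps one extra full copy of the data.
raid1 = raw_per_usable(1, 1)  # 2x raw per usable
# Assumption: RAID5 stripes data across 3 components and adds 1 parity component.
raid5 = raw_per_usable(3, 1)  # 4/3x raw per usable

# Fraction of raw capacity saved by moving the same data from RAID1 to RAID5.
savings = 1 - raid5 / raid1

print(f"RAID1 overhead: {float(raid1):.2f}x")
print(f"RAID5 overhead: {float(raid5):.2f}x")
print(f"Raw capacity saved by RAID5 vs RAID1: {float(savings):.0%}")
```

Under these assumptions the saving works out to exactly one third, independent of how much data the virtual machine holds, which is why it can be described as guaranteed, unlike the workload-dependent savings from compression.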
My peers at VMware (John Nicholson, Pete Flecha (these two are the voices and brains behind the vSpeakingPodcast – definitely worth listening to), Teodora Todorova Hristov, Pete Koehler and Cedric Rajendran) have written great and in-depth blogs about these features that caught my attention:
vSAN HCI Mesh – eliminate stranded storage by enabling VMs registered to cluster A to access storage from cluster B
Shared Witness for 2-Node Deployments – reduced administration time and infrastructure costs through one witness for up to sixty-four 2-node clusters
Enhanced vSAN File Services – adds SMB v2.1 and v3 for Windows and Mac clients. Adds Kerberos authentication for existing NFS v4.1
Space Efficiency: Compression only option – For demanding workloads that cannot take advantage of deduplication. Compression only has higher throughput, lower latency, and significantly reduced impact on write performance compared to deduplication and compression. Compression only has a reduced failure domain and 7x faster data rebuild rates.
Spare Capacity Management – Slack space guidance of 25% has been replaced with a calculated Operational Reserve that requires less space and decreases with scale. There is an additional option to enable Host rebuild reserve; the VxRail Sizing Tool reserves this by default when sizing configurations, with the filter Add HA
Enhanced Durability during Maintenance Mode – data intended for a host in maintenance mode is temporarily recorded in a delta file on another host, providing the configured FTT during Maintenance Mode operations
Learn About the Latest Major VxRail Software Release: VxRail 8.0.000
Mon, 09 Jan 2023 14:45:15 -0000
Happy New Year! I hope you had a wonderful and restful holiday, and you have come back reinvigorated. Because much like the fitness centers in January, this VxRail blog site is going to get busy. We have a few major releases in line to greet you, and there is much to learn.
First in line is the VxRail 8.0.000 software release that provides introductory support for VMware vSphere 8, which has created quite the buzz these past few months. Let’s walk through the highlights of this release.
- For VxRail users who want to be early adopters of vSphere 8, VxRail 8.0.000 provides the first upgrade path for VxRail clusters to transition to VMware’s latest vSphere software train. Only clusters with VxRail nodes based on either the 14th generation or 15th generation PowerEdge servers can upgrade to vSphere 8, because VMware has removed support for a legacy BIOS driver used by 13th generation PowerEdge servers. Importantly, users need to upgrade their vCenter Server to version 8.0 before a cluster upgrade, and vSAN 8.0 clusters require users to upgrade their existing vSphere and vSAN licenses. In VxRail 8.0.000, the VxRail Manager has been enhanced to check platform compatibility and warn users of license issues to prevent compromised situations. Users should always consult the release notes to fully prepare for a major upgrade.
- VxRail 8.0.000 also provides introductory support for vSAN Express Storage Architecture (ESA), which has garnered much attention for its potential while eliciting just as much curiosity because of its newness. To level set, vSAN ESA is an optimized version of vSAN that exploits the full potential of the very latest in hardware, such as multi-core processing, faster and larger capacity memory, and NVMe technology to unlock new capabilities to drive new levels of performance and efficiency. You can get an in-depth look at vSAN ESA in David Glynn’s blog. It is important to note that vSAN ESA is an alternative, optional vSAN architecture. The existing architecture (which is now referred to as Original Storage Architecture (OSA)) is still available in vSAN 8. It’s a choice that users can make on which one to use when deploying clusters.
In order to deploy VxRail clusters with vSAN ESA, you need to order brand-new VxRail nodes specifically configured for vSAN ESA. This new architecture eliminates the use of discrete cache and capacity drives. Nodes will require all NVMe storage drives. Each drive will contribute to cache and capacity. VxRail 8.0.000 offers two choices for platforms: E660N and the P670N. The user will select either the 3.2 TB or 6.4 TB TLC NVMe storage drives to populate each node in their new VxRail cluster with vSAN ESA. To learn about the configuration options, see David Glynn’s blog.
- The support for vSphere 8 in VxRail 8.0.000 also includes support for the increased cache size for VxRail clusters with vSAN 8.0 OSA. The increase from 600 GB to 1.6 TB will provide a significant performance gain. VxRail already has cache drives that can take advantage of the larger cache size. It is easier to deploy a new cluster with a larger cache size than for an existing cluster to expand the current cache size. (For existing clusters, nodes need their disk groups rebuilt when the cache is expanded. This can be a lengthy and tedious endeavor.)
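The upgrade prerequisites in the highlights above can be sketched as a simple pre-check. This is only an illustration of the stated rules (14th or 15th generation PowerEdge nodes, vCenter Server at 8.0 first, licenses upgraded); it is not how VxRail Manager implements its checks, and every name in it (`ClusterState`, `upgrade_blockers`, the field names) is hypothetical.

```python
from dataclasses import dataclass

# From the post: only 14th and 15th generation PowerEdge nodes can move to vSphere 8.
SUPPORTED_GENERATIONS = {14, 15}


@dataclass
class ClusterState:
    """Hypothetical summary of a cluster, supplied by the caller."""
    poweredge_generations: list[int]   # one entry per node
    vcenter_version: tuple[int, int]   # (major, minor)
    licenses_upgraded: bool            # vSphere and vSAN licenses ready for 8.0


def upgrade_blockers(state: ClusterState) -> list[str]:
    """Return human-readable reasons the cluster is not ready for 8.0.000."""
    blockers = []
    unsupported = sorted(set(state.poweredge_generations) - SUPPORTED_GENERATIONS)
    if unsupported:
        blockers.append(f"unsupported PowerEdge generation(s): {unsupported}")
    if state.vcenter_version < (8, 0):
        blockers.append("vCenter Server must be upgraded to 8.0 first")
    if not state.licenses_upgraded:
        blockers.append("vSphere and vSAN licenses must be upgraded")
    return blockers


ready = ClusterState([14, 15, 15], (8, 0), True)
not_ready = ClusterState([13, 14], (7, 0), False)
print(upgrade_blockers(ready))
print(upgrade_blockers(not_ready))
```

A sketch like this is no substitute for the release notes, which remain the authoritative list of prerequisites.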
Major VMware releases like vSphere 8 often shine a light on the differentiated experience that our VxRail users enjoy. The checklist of considerations only grows when you’re looking to upgrade to a new software train. VxRail users have come to expect that VxRail provides them the necessary guardrails to guide them safely along the upgrade path to reach their destination. The 800,000 hours of test run time performed by our 100+ staff members, who are dedicated to maintaining the VxRail Continuously Validated States, is what gives our customers the confidence to move fearlessly from one software version to the next. And for customers looking to explore the potential of vSAN ESA, the partnership between VxRail and VMware engineering teams adds to why VxRail is the fastest and most effective path for users to maximize the return on their investment in VMware’s latest technologies.
If you’re interested in upgrading to VxRail 8.0.000, please read the release notes.
If you’re looking for more information about vSAN ESA and VxRail’s support for vSAN ESA, check out this blog.
Author: Daniel Chiu