2nd Gen AMD EPYC now available to power your favorite hyperconverged platform ;) VxRail
Mon, 27 Jul 2020 18:46:53 -0000
Expanding the range of VxRail choices to include 64 cores of 2nd Gen AMD EPYC compute
Last month, Dell EMC expanded our very popular E Series (the E for Everything Series) with the introduction of the E665/F/N, our very first VxRail with an AMD processor, and what a processor it is! The 2nd Gen AMD EPYC processor came to market with a lot of industry-leading capabilities:
- Up to 64 cores in a single processor, with 8-, 12-, 16-, 24-, 32-, and 48-core offerings also available
- Eight memory channels, which are not only more numerous but also faster, at 3200 MT/s. The 2nd Gen EPYC can also address much more memory per processor
- 7nm transistors. Smaller transistors mean more powerful and more energy efficient processors
- Up to 128 lanes of PCIe Gen 4.0, with 2X the bandwidth of PCIe Gen 3.0.
These industry-leading capabilities enable the VxRail E665 series to deliver dual-socket performance in a single-socket model, and can provide up to 90% greater general-purpose CPU capacity than other VxRail models when configured with single-socket processors.
So, what is the sweet spot or ideal use case for the E665? As always, it depends on many things. Unlike the D Series (our D for Durable Series) that we also launched last month, which has clear rugged use cases, the E665 and the rest of the E Series very much live up to their “Everything” name, and perform admirably in a variety of use cases.
While the 2nd Gen EPYC 64-core processors grab the headlines, there are multiple AMD processor options, including the 16-core AMD 7F52 at 3.50GHz with a max boost of 3.9GHz, for applications that benefit from raw clock speed or where application licensing is core based. On the topic of licensing, I would be remiss if I didn’t mention VMware’s update to its per-CPU pricing earlier this year. As a result, processors with more than 32 cores require a second VMware per-CPU license. This may make a 32-core processor an attractive option from an overall capacity and performance versus hardware and licensing cost perspective.
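To make the licensing arithmetic concrete, here is a minimal Python sketch. It assumes one per-CPU license covers up to 32 cores and that every further block of up to 32 cores needs another license. The text above only states that more than 32 cores requires a second license, so the rounding rule and the function names are illustrative assumptions, not VMware's published formula.

```python
import math

# Assumption: one per-CPU license covers up to 32 cores (from the text above);
# treating every additional 32-core block the same way is our extrapolation.
CORES_PER_LICENSE = 32


def licenses_per_cpu(cores: int) -> int:
    """Per-CPU licenses needed for one processor with the given core count."""
    if cores <= 0:
        raise ValueError("core count must be positive")
    return math.ceil(cores / CORES_PER_LICENSE)


def cores_per_license(cores: int) -> float:
    """How many physical cores each purchased license actually covers."""
    return cores / licenses_per_cpu(cores)


if __name__ == "__main__":
    # Core counts listed for the 2nd Gen EPYC options in this post.
    for cores in (8, 12, 16, 24, 32, 48, 64):
        print(f"{cores:>2} cores: {licenses_per_cpu(cores)} license(s), "
              f"{cores_per_license(cores):.0f} cores per license")
```

Under these assumptions, the 32-core and 64-core parts both get the full 32 cores out of each license, while a 48-core part pays for two licenses but uses only 24 cores per license.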
Speaking of overall costs, the E665 has dual 10Gb RJ45/SFP+ or dual 25Gb SFP28 base networking options, which can be further expanded with PCIe NICs, including a dual 100Gb SFP28 option. From a cost perspective, the price delta between 10Gb and 25Gb networking is minimal. This is worth considering, particularly for greenfield sites, and even for brownfield sites where the networking may be upgraded in the near future. Last year, we began offering Fibre Channel cards on VxRail, which are also available on the E665. While FC connectivity may sound strange for a hyperconverged infrastructure platform, it does make sense for many of our customers who have existing SAN infrastructure, or who have applications (PowerMax for extremely large databases requiring SRDF) or storage needs (Isilon for a large repository of medical files) that are better suited to SAN. While we’d prefer that SAN to be a Dell EMC product, as long as it is on the VMware SAN HCL, it can be connected. Providing this option enables customers to get the best that both worlds have to offer.
The options don’t stop there. While the majority of VxRail nodes are sold with all-flash configurations, there are customers whose needs are met with hybrid configs, or who are looking towards all-NVMe options. The E665 can be configured with as little as 960GB of raw storage capacity, up to maximums of 14TB hybrid, 46TB all-flash, or 32TB all-NVMe. Memory options consist of 4, 8, or 16 RDIMMs of 16GB, 32GB or 64GB in size. Maximum memory performance, 3200 MT/s, is achieved with one DIMM per memory channel; adding a second matching DIMM per channel reduces the transfer rate slightly to 2933 MT/s.
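The memory options above can be tabulated with a short Python sketch. It takes the eight memory channels, the DIMM counts and sizes, and the 3200 and 2933 MT/s figures from this post. Treating the 4-DIMM configuration as also running at 3200 MT/s (at most one DIMM per channel, with half the channels empty) is an assumption, and the function name is hypothetical.

```python
MEMORY_CHANNELS = 8            # per 2nd Gen EPYC processor, from this post
DIMM_COUNTS = (4, 8, 16)       # supported RDIMM counts, from this post
DIMM_SIZES_GB = (16, 32, 64)   # supported RDIMM sizes, from this post


def memory_config(dimm_count: int, dimm_size_gb: int) -> tuple[int, int]:
    """Return (total capacity in GB, memory speed in MT/s) for one node."""
    if dimm_count not in DIMM_COUNTS:
        raise ValueError(f"unsupported DIMM count: {dimm_count}")
    if dimm_size_gb not in DIMM_SIZES_GB:
        raise ValueError(f"unsupported DIMM size: {dimm_size_gb}GB")
    capacity_gb = dimm_count * dimm_size_gb
    # Assumption: any layout with at most one DIMM per channel runs at
    # 3200 MT/s; a second DIMM per channel drops the speed to 2933 MT/s.
    speed_mts = 3200 if dimm_count <= MEMORY_CHANNELS else 2933
    return capacity_gb, speed_mts


if __name__ == "__main__":
    for count in DIMM_COUNTS:
        for size in DIMM_SIZES_GB:
            capacity, speed = memory_config(count, size)
            print(f"{count:>2} x {size}GB = {capacity:>4}GB at {speed} MT/s")
```

The trade-off this makes visible: the largest capacity (16 x 64GB) comes at the lower speed, while 8 x 64GB keeps every channel at full speed with half the capacity.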
VxRail and Dell Technologies very much recognize that the needs of our customers vary greatly. A product with a single set of options cannot meet all of our customers’ different needs. Today, VxRail offers six different series, each with a different focus:
- Everything E Series: a power-packed 1U of choice
- Performance-focused P Series with dual or quad socket options
- VDI-focused V Series with a choice of five different NVIDIA GPUs
- Durable D Series are MIL-STD 810G certified for extreme heat, sand, dust, and vibration
- Storage-dense S Series with 96TB of hybrid storage capacity
- General purpose and compute dense G Series with 228 cores in a 2U form factor
With the highly flexible configuration choices, there is a VxRail for almost every use case, and if there isn’t, there is more than likely something in the broad Dell Technologies portfolio that is.
Author: David Glynn, Sr. Principal Engineer, VxRail Tech Marketing
Related Blog Posts
More GPUs, CPUs and performance - oh my!
Mon, 14 Jun 2021 11:18:50 -0000
Continuous hardware and software changes deployed with VxRail’s Continuously Validated State
A wonderful aspect of software-defined-anything, particularly when built on world class PowerEdge servers, is speed of innovation. With a software-defined platform like VxRail, new technologies and improvements are continuously added to provide benefits and gains today, and not a year or so in the future. With the release of VxRail 7.0.200, we are at it again! This release brings support for VMware vSphere and vSAN 7.0 Update 2, and for new hardware: 3rd Gen AMD EPYC processors (Milan), and more powerful hardware from NVIDIA with their A100 and A40 GPUs.
VMware, as always, does a great job of detailing the many enhanced or new features in a release, from high-level What’s New corporate or personal blog posts to in-depth videos by Duncan Epping. However, there are a few changes that I want to highlight:
Get thee to 25GbE: A trilogy of reasons - Storage, load-balancing, and pricing.
vSAN is a distributed storage system. To that end, anything that improves the network or networking efficiency improves storage performance and application performance, but there is more to networking than big, low-latency pipes. RDMA has been a part of vSphere since the 6.5 release, but it is only with 7.0 Update 2 that it is leveraged by vSAN. John Nicholson explains the nuts and bolts of vSAN RDMA in this blog post, but only touches on the performance gains. From our performance testing on VxRail, I can share the gains we have seen: up to 5% reduction in CPU utilization, up to 25% lower latency, and up to 18% higher IOPS, along with increases in read and write throughput. It should be noted that even with medium block IO, vSAN is more than capable of saturating a 10GbE port. RDMA is pushing performance beyond that, and we’ve yet to see what Intel 3rd Generation Xeon processors will bring. The only fly in the ointment for vSAN RDMA is the current small list of approved network cards; no doubt more will be added soon.
vSAN is not the only feature that enjoys large low-latency pipes. Niels Hagoort describes the changes in vSphere 7.0 Update 2 that have made vMotion faster, thus making Balancing Workloads Invisible and the lives of virtualization administrators everywhere a lot better. Aside: Can I say how awesome it is to see VMware continuing to enhance a foundational feature that they first introduced in 2003, a feature that for many was that lightbulb Aha! moment that started their virtualization journey.
One last nudge: pricing. The cost delta between 10GbE and 25GbE network hardware is minimal, so for greenfield deployments the choice is easy; you may not need it today, but workloads and demands continue to grow. For brownfield, where the existing network is not due for replacement, the choice is still easy. 25GbE NICs and switch ports can negotiate down to 10GbE, making a phased migration possible: VxRail nodes now and switches in the future. The inverse is also possible: upgrade the network to 25GbE switches while still connecting your existing VxRail 10GbE SFP+ NIC ports.
Is 25GbE in your infrastructure upgrade plans yet? If not, maybe it should be.
A duo of AMD goodness
Last year we released two AMD-based VxRail platforms, the E665/F and the P675F/N, so I’m delighted to see CPU scheduler optimizations for AMD EPYC processors, as described in Aditya Sahu’s blog post. What is even better is the 29-page performance study Aditya links to; the depth of detail provided on how ESXi CPU scheduling works, and didn’t work, with AMD EPYC processors is truly educational. The extensive performance testing VMware continuously runs and the results they share (spoiler: they achieved very significant gains) are also a worthwhile read. In our testing we’ve seen that with these AMD scheduler optimizations alone, VxRail 7.0.200 can provide up to 27% more IOPS and up to 27% lower latency for both RAID1 and RAID5 with relational database (RDBMS22K 60R/40W 100%Random) workloads.
VxRail begins shipping the 3rd generation AMD EPYC processors, also known as Milan, in VxRail E665 and P675 nodes later this month. These are not a replacement for the current 2nd Gen EPYC processors we offer, but rather the addition of higher performing 24-core, 32-core, and 64-core choices to the VxRail lineup, delivering up to 33% more IOPS and 16% lower latency across a range of workloads and block sizes. Check out this VMware blog post for the performance gains they showcase with the VMmark benchmarking tool.
HCI Mesh – only recently introduced, yet already getting better
When VMware released HCI Mesh just last October, it enabled stranded storage on one VxRail cluster to be consumed by another VxRail cluster. With the release of VxRail 7.0.200, this has been expanded, making it applicable to more customers by enabling any vSphere cluster to also be a consumer of that excess storage capacity. These remote clusters do not require a vSAN license and consume the storage in the same manner they would any other datastore. This opens up some interesting multi-cluster use cases, for example:
In solutions where software application licensing requires each core or socket in the vSphere cluster to be licensed, this licensing cost can easily dwarf other costs. Now this application can be deployed on a small compute-only cluster, while consuming storage from the larger VxRail cluster. Or, where the density of storage per socket didn’t make VxRail viable, it can now be achieved with a smaller VxRail cluster plus a separate compute-only cluster. If only all the goodness that is VxRail was available in a compute-only cluster; now that would be something dynamic…
A GPU for every workload
GPUs, once the domain of PC gamers, are now a data center staple with their parallel processing capabilities accelerating a variety of workloads. The versatile VxRail V Series has multiple NVIDIA GPUs to choose from and we’ve added two more with the addition of the NVIDIA A40 and A100. The A40 is for sophisticated visual computing workloads – think large complex CAD models, while the A100 is optimized for deep learning inference workloads for high-end data science.
Evolution of hardware in a software-defined world
PowerEdge took a big step forward with their recent release built on 3rd Gen Intel Xeon Scalable processors. Software-defined principles enable VxRail to not only quickly leverage this big step forward, but also to quickly leverage all the small steps in hardware changes throughout a generation. Building on the latest PowerEdge servers, we Reimagine HCI with VxRail with the next generation VxRail E660/F, P670F or V670F. Plus, what’s great about VxRail is that you can seamlessly integrate this latest technology into your existing infrastructure environment. This is an exciting release, but equally exciting are all the incremental changes that VxRail software-defined infrastructure will get along the way with PowerEdge and VMware.
VxRail, flexibility is at its core.
- VxRail systems with Intel 3rd Generation Xeon processors will be globally available in July 2021.
- VxRail systems with AMD 3rd Generation EPYC processors will be globally available in June 2021.
- VxRail HCI System Software updates will be globally available in July 2021.
- VxRail dynamic nodes will be globally available in August 2021.
- VxRail self-deployment options will begin availability in North America through an early access program in August 2021.
- Blog: Reimagine HCI with VxRail
- Attend our launch webinar to learn more.
- Press release: Dell Technologies Reimagines Dell EMC VxRail to Offer Greater Performance and Storage Flexibility
I feel the need – the need for speed (and endurance): Intel Optane edition
Tue, 12 Oct 2021 21:38:31 -0000
It has only been three short months since we launched VxRail on 15th Generation PowerEdge, but we're already expanding the selection of configuration offerings. So far we've added 18 additional processors to power your workloads, including some high frequency and low core count options, which is delightful news for those with applications that are licensed per core. We've also added an additional NVIDIA GPU (the A30) and a slew of additional drives, and doubled the RAM capacity to 8TB. I've probably missed something, as it can be hard to keep up with all the innovations taking place within this race car that is VxRail!
In my last blog, I hinted at one of those drive additions: faster cache drives. Today I'm excited to announce that you can now order, and turbo charge your VxRail with, the 400GB or 800GB Intel P5800X, Intel’s second generation Optane NVMe drive. Before we delve into some of the performance numbers, let’s discuss what it is about the Optane drives that makes them so special. More specifically, what is it about them that enables them to deliver so much more performance, in addition to significantly higher endurance rates?
To grossly over-simplify it, and my apologies in advance to the Intel engineers who poured their lives into this: when writing to NAND flash, an erase cycle needs to be performed before a write can be made. These erase cycles are time-consuming operations and the main reason why random write IO capability on NAND flash is often a fraction of the read capability. Additionally, garbage collection runs continuously in the background to ensure that there is space available for incoming writes. Optane, on the other hand, does bit-level write-in-place operations, so it doesn’t require an erase cycle or garbage collection, and writes incur no performance penalty. Hence, random write IO capability almost matches the random read IO capability.
So just how much better is endurance with this new Optane drive? Endurance can be measured in Drive Writes Per Day (DWPD), which measures how many times the drive's entire capacity could be overwritten each day of its warranty life. For the 1.6TB NVMe P5600 this is 3 DWPD, or 55 MB per second, every second for five years: just shy of 9PB of writes. Not bad. However, the 800GB Optane P5800X is rated at 100 DWPD, so it will endure 146PB over its five-year warranty life, or almost 1 GB per second (926 MB/s), every second for five years. Not quite indestructible, but that is a lot of writes, so much so that you don’t need extra capacity for wear leveling, and a smaller capacity drive will suffice.
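The endurance figures above follow directly from the DWPD definition, and a small Python sketch makes the arithmetic easy to check. It assumes decimal units (1 TB = 1,000,000 MB, 1 PB = 1,000 TB) and 365-day years, which is what reproduces the numbers quoted in this post; the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365  # assumption: leap days ignored


def endurance(capacity_tb: float, dwpd: float, warranty_years: int = 5):
    """Return (lifetime writes in PB, sustained write rate in MB/s)."""
    daily_writes_tb = capacity_tb * dwpd
    lifetime_pb = daily_writes_tb * DAYS_PER_YEAR * warranty_years / 1000
    sustained_mb_s = daily_writes_tb * 1_000_000 / SECONDS_PER_DAY
    return lifetime_pb, sustained_mb_s


if __name__ == "__main__":
    # Capacities and DWPD ratings as quoted in this post.
    drives = {
        "P5600 1.6TB (3 DWPD)": (1.6, 3),
        "Optane P5800X 800GB (100 DWPD)": (0.8, 100),
    }
    for name, (capacity_tb, dwpd) in drives.items():
        lifetime_pb, mb_s = endurance(capacity_tb, dwpd)
        print(f"{name}: {lifetime_pb:.2f} PB lifetime, {mb_s:.0f} MB/s sustained")
```

Note that the 55 MB/s quoted for the P5600 is the truncated value of roughly 55.6 MB/s, so a rounded print shows 56.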
You might wonder why you should care about endurance, as Dell EMC will replace the drive under warranty anyway. There are three reasons. First, when a cache drive fails, its diskgroup is taken offline, so not only have you lost performance and capacity, your environment is also taking on the additional burden of a rebuild operation to re-protect your data. Secondly, more and more systems are being deployed outside of the core data center. Replacing a drive in your data center is straightforward, and you might even have spares onsite, but what about outside of your core data center? What is your plan for replacing a drive at a remote office, or a thousand miles away? What if that remote location is not an office but an oil rig one hundred miles offshore, or a cruise ship halfway around the world, where the cost of getting a replacement drive there is not trivial? In these remote locations, onsite spares are commonplace, but the exceptions are what lead me to the third reason: Murphy's Law. IT and IT staffing might be an afterthought at these remote locations. Swapping out a failed drive at a remote location that lacks true IT staffing may not get the priority it deserves, and then there is that ever-present risk of user error... “Oh, you meant the other drive?!? Sorry...”
Cache in its many forms plays an important role in the datacenter. Cache enables switches and storage to deliver higher levels of performance. On VxRail, our cache drives fall into two categories, SAS and NVMe, with NVMe delivering up to 35% higher IOPS and 14% lower latency. Among our NVMe cache drives we have two from Intel: the 1.6TB P5600 and the Optane P5800X, in 400GB and 800GB capacities. The links for each will bring you to the drive specification, including performance details. But how does performance at the drive level impact performance at the solution level? At the end of the day, that is what your application consumes: performance at the solution level, after cache mirroring, network hops, and the vSAN stack. Intel is a great partner to work with; when we checked with them about publishing solution-level performance data comparing the two drives side by side, they were all for it.
In my over-simplified explanation above, I described how the write cycle for Optane drives is significantly different, as an erase operation does not need to be done first. So how does that play out in a full solution stack? Figure 1 compares the two drives in a four-node VxRail P670F cluster running a 100% sequential write 64KB workload. This is not a test that reflects any real-world workload, but one that really stresses the vSAN cache layer, highlights the consistent write performance that 3D XPoint technology delivers, and shows how Optane is able to de-stage cache when it fills up without compromising performance.
Figure 1: Optane cache drives deliver consistent and predictable write performance
When we look at performance, there are two numbers to keep in mind: IOPS and latency. The target is to have high IOPS with low and predictable latency, at a real-world IO size and read:write ratio. To that end, let’s look at how VxRail performance differs with the P5600 and P5800X under OLTP32K (70R30W) and RDBMS (60R40W) benchmark workloads, as shown in Figure 2.
Figure 2: Optane cache drives deliver higher performance and lower latency across a variety of workload types.
It doesn't take an expert to see that with the P5800X, this four-node VxRail P670F cluster's peak performance is significantly higher than when it is equipped with the P5600 as a cache drive: for RDBMS workloads, up to 44% higher IOPS with a 37% reduction in latency. But peak performance isn't everything. Many workloads, particularly databases, place a higher importance on latency requirements. What if our workload, database or otherwise, requires 1ms response times? Maybe this is the Service Level Agreement (SLA) that the infrastructure team has with the application team. In such a situation, based on the data shown, and for an OLTP 70:30 workload with a 32K block size, the VxRail cluster would deliver over twice the performance at the same latency SLA, going from 147,746 to 314,300 IOPS.
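For readers who want to check the "over twice the performance" claim, here is a minimal Python sketch using only the two IOPS figures quoted above for the 1ms SLA case. The function and constant names are hypothetical, and no other data points from Figure 2 are assumed.

```python
def iops_gain(baseline_iops: int, improved_iops: int) -> tuple[float, float]:
    """Return (speed-up ratio, percentage increase) between two IOPS results."""
    if baseline_iops <= 0:
        raise ValueError("baseline IOPS must be positive")
    ratio = improved_iops / baseline_iops
    percent_increase = (ratio - 1) * 100
    return ratio, percent_increase


# OLTP 70:30, 32K block size, 1ms latency SLA, as quoted in this post.
P5600_IOPS_AT_SLA = 147_746
P5800X_IOPS_AT_SLA = 314_300

if __name__ == "__main__":
    ratio, percent = iops_gain(P5600_IOPS_AT_SLA, P5800X_IOPS_AT_SLA)
    print(f"{ratio:.2f}x the IOPS at the same SLA (+{percent:.0f}%)")
```

The same helper can be pointed at your own benchmark results to compare any two configurations at a fixed latency target.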
In the datacenter, as in life, we are often faced with "Good, fast, or cheap. Choose two." When you compare the price tag of the P5600 and P5800X side by side, the Optane drive has a significant premium for its good and fast. However, keep in mind that you are not buying an individual drive, you are buying a full stack solution of several pieces of hardware and software, where the cost of the premium pales in comparison to the increased endurance and performance. Whether you are looking to turbo charge your VxRail like a racecar, or make it as robust as a tank, Intel Optane SSD drives will get you both.
David Glynn, Technical Marketing Engineer, VxRail at Dell Technologies
LinkedIn: David Glynn