I feel the need – the need for speed (and endurance): Intel Optane edition
Tue, 12 Oct 2021 21:38:31 -0000
It has only been three short months since we launched VxRail on 15th Generation PowerEdge, but we're already expanding the selection of configuration offerings. So far we've added 18 additional processors to power your workloads, including some high frequency and low core count options, which is delightful news for those with applications that are licensed per core. We've also added an additional NVIDIA GPU (the A30) and a slew of additional drives, and doubled the RAM capacity to 8TB. I've probably missed something, as it can be hard to keep up with all the innovations taking place within this race car that is VxRail!
In my last blog, I hinted at one of those drive additions: faster cache drives. Today I'm excited to announce that you can now order the 400GB or 800GB Intel P5800X, Intel’s second generation Optane NVMe drive, and turbo charge your VxRail with it. Before we delve into some of the performance numbers, let’s discuss what it is about the Optane drives that makes them so special. More specifically, what is it about them that enables them to deliver so much more performance, in addition to significantly higher endurance rates?
To grossly over-simplify it, and my apologies in advance to the Intel engineers who poured their lives into this, when writing to NAND flash an erase cycle needs to be performed before a write can be made. These erase cycles are time-consuming operations and the main reason why random write IO capability on NAND flash is often a fraction of the read capability. Additionally, garbage collection runs continuously in the background to ensure that there is space available for incoming writes. Optane, on the other hand, performs bit-level write-in-place operations, so it requires no erase cycle and no garbage collection, and its writes carry no performance penalty. Hence, random write IO capability almost matches the random read IO capability.
So just how much better is endurance with this new Optane drive? Endurance can be measured in Drive Writes Per Day (DWPD), which measures how many times the drive's entire capacity could be overwritten each day of its warranty life. For the 1.6TB NVMe P5600 this is 3 DWPD, or 55 MB per second, every second for five years – just shy of 9PB of writes, not bad. However, the 800GB Optane P5800X is rated at 100 DWPD and will endure 146PB over its five-year warranty life, or almost 1 GB per second (926 MB/s), every second, for five years. Not quite indestructible, but that is a lot of writes, so much so that you don’t need extra capacity for wear leveling and a smaller capacity drive will suffice.
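The DWPD arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `endurance` is made up for illustration, and it assumes decimal drive units (1 TB = 1,000,000 MB, 1 PB = 1,000 TB) and 365-day years.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365  # assumption: leap days ignored


def endurance(capacity_tb: float, dwpd: float, warranty_years: int = 5):
    """Return (sustained MB/s, total PB written) for a drive rated at `dwpd`.

    Assumes decimal units: 1 TB = 1,000,000 MB and 1 PB = 1,000 TB.
    """
    tb_per_day = capacity_tb * dwpd
    mb_per_second = tb_per_day * 1_000_000 / SECONDS_PER_DAY
    total_pb = tb_per_day * DAYS_PER_YEAR * warranty_years / 1_000
    return mb_per_second, total_pb


# Capacities and DWPD ratings are the ones quoted in the text.
for name, capacity_tb, dwpd in [("P5600 1.6TB", 1.6, 3), ("P5800X 800GB", 0.8, 100)]:
    mbps, pb = endurance(capacity_tb, dwpd)
    print(f"{name}: {mbps:.1f} MB/s sustained, {pb:.2f} PB over five years")
```

Under those assumptions the P5600 works out to roughly 55.6 MB/s and 8.76 PB, and the P5800X to roughly 925.9 MB/s and 146 PB, in line with the figures quoted above.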
You might wonder why you should care about endurance, as Dell EMC will replace the drive under warranty anyway. There are three reasons. First, when a cache drive fails, its diskgroup is taken offline, so not only have you lost performance and capacity, but your environment is also taking on the additional burden of a rebuild operation to re-protect your data. Secondly, more and more systems are being deployed outside of the core data center. Replacing a drive in your data center is straightforward, and you might even have spares onsite, but what about outside of your core data center? What is your plan for replacing a drive at a remote office, or a thousand miles away? What if that remote location is not an office but an oil rig one hundred miles offshore, or a cruise ship halfway around the world where the cost of getting a replacement drive there is not trivial? In these remote locations, onsite spares are commonplace, but the exceptions are what lead me to the third reason, Murphy's Law. IT and IT staffing might be an afterthought at these remote locations. Swapping out a failed drive at a remote location that lacks true IT staffing may not get the priority it deserves, and then there is that ever present risk of user error... “Oh, you meant the other drive?!? Sorry...”
Cache in its many forms plays an important role in the datacenter. Cache enables switches and storage to deliver higher levels of performance. On VxRail, our cache drives fall into two categories, SAS and NVMe, with NVMe delivering up to 35% higher IOPS and 14% lower latency. Among our NVMe cache drives we have two from Intel: the 1.6TB P5600 and the Optane P5800X, in 400GB and 800GB capacities. The links for each will bring you to the drive specification including performance details. But how does the performance at a drive level impact performance at the solution level? After all, what your application consumes is performance at the solution level, after cache mirroring, network hops, and the vSAN stack. Intel is a great partner to work with: when we checked with them about publishing solution level performance data comparing the two drives side-by-side, they were all for it.
In my over-simplified explanation above, I described how the write cycle for Optane drives is significantly different, as an erase operation does not need to be done first. So how does that play out in a full solution stack? Figure 1 compares the two drives in a four node VxRail P670F cluster running a 100% sequential write 64KB workload. Not a test that reflects any real-world workload, but one that really stresses the vSAN cache layer, highlights the consistent write performance that 3D XPoint technology delivers, and shows how Optane is able to de-stage cache when it fills up without compromising performance.
Figure 1: Optane cache drives deliver consistent and predictable write performance
When we look at performance, there are two numbers to keep in mind: IOPS and latency. The target is to have high IOPS with low and predictable latency, at a real-world IO size and read:write ratio. To that end, let’s look at how VxRail performance differs with the P5600 and P5800X under the OLTP32K (70R30W) and RDBMS (60R40W) benchmark workloads, as shown in Figure 2.
Figure 2: Optane cache drives deliver higher performance and lower latency across a variety of workload types.
It doesn't take an expert to see that with the P5800X this four node VxRail P670F cluster's peak performance is significantly higher than when it is equipped with the P5600 as a cache drive. For RDBMS workloads, that means up to 44% higher IOPS with a 37% reduction in latency. But peak performance isn't everything. Many workloads, particularly databases, place a higher importance on latency requirements. What if our workload, database or otherwise, requires 1ms response times? Maybe this is the Service Level Agreement (SLA) that the infrastructure team has with the application team. In such a situation, based on the data shown, and for an OLTP 70:30 workload with a 32K block size, the VxRail cluster would deliver over twice the performance at the same latency SLA, going from 147,746 to 314,300 IOPS.
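To put a number on "over twice the performance", here is a small sketch using only the two IOPS figures quoted above. The helper name `speedup` and the constant names are illustrative, not part of any VxRail tooling.

```python
def speedup(baseline_iops: int, improved_iops: int) -> float:
    """Ratio of improved to baseline IOPS measured at the same latency target."""
    return improved_iops / baseline_iops


# IOPS at the 1ms latency SLA for OLTP 70:30 at 32K, as quoted in the text.
P5600_IOPS_AT_SLA = 147_746
P5800X_IOPS_AT_SLA = 314_300

ratio = speedup(P5600_IOPS_AT_SLA, P5800X_IOPS_AT_SLA)
print(f"{ratio:.2f}x the IOPS at the same SLA ({(ratio - 1) * 100:.0f}% more)")
```

The ratio comes out at roughly 2.13, so the cluster delivers a little more than double the IOPS without relaxing the latency target.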
In the datacenter, as in life, we are often faced with "Good, fast, or cheap. Choose two." When you compare the price tags of the P5600 and P5800X side by side, the Optane drive carries a significant premium for being both good and fast. However, keep in mind that you are not buying an individual drive, you are buying a full stack solution of several pieces of hardware and software, where the cost of the premium pales in comparison to the increased endurance and performance. Whether you are looking to turbo charge your VxRail like a racecar, or make it as robust as a tank, Intel Optane SSD drives will get you both.
David Glynn, Technical Marketing Engineer, VxRail at Dell Technologies
LinkedIn: David Glynn
Related Blog Posts
Top benefits to using Intel Optane NVMe for cache drives in VxRail
Wed, 20 May 2020 14:42:17 -0000
Performance, endurance, and all without a price jump!
There is a saying that “A picture paints a thousand words” but let me add that a “graph can make for an awesome picture”.
Last August we at VxRail worked with ESG on a technical validation paper that included, among other things, the recent addition of Intel Optane NVMe drives for the vSAN caching layer. Figure 3 in this paper is a graph showing the results of a throughput benchmark workload (more on benchmarks later). When I do customer briefings and the question of vSAN caching performance comes up, this is my go-to whiteboard sketch because on its own it paints a very clear picture about the benefit of using Optane drives – and also because it is easy to draw.
In the public and private cloud, predictability of performance is important, doubly so for any form of latency. This is where caching comes into play: rather than having to wait on a busy system, we just leave the write in the write cache inbox and get an acknowledgment. The inverse is also true. Like many parents, I read almost the same bedtime stories to my young kids every night, so you can be sure those books remain close to hand on my bedside “read cache” table. This write and read caching greatly helps in providing performance and consistent latency. With vSAN all-flash there is no longer any read cache, as the flash drives at the capacity layer provide enough random read access performance… just as my collection of bedtime story books has been replaced with a Kindle full of eBooks. Back to the write cache inbox where we’ve been dropping things off – at some point, this write cache needs to be emptied, and this is where the Intel Optane NVMe drives shine. Drawing the comparison back to my kids, I no longer drive to a library to drop off books. With a flick of my finger I can return, or in cache terms de-stage, books from my Kindle back to the town library - the capacity drives, if you will. This is a lot less disruptive to my day-to-day life: I don’t need to schedule it, I don’t need to stop what I’m doing, and with a bit of practice I’ve been able to do this mid-story. Let’s look at this in actual IT terms and business benefits.
To really show off how well the Optane drives shine, we want to stress the write cache as much as possible. This is where benchmarking tools and the right knowledge of how to apply them come into play. We had ESG design and run these benchmarking workloads for us. Now let’s be clear, this test is not reflective of a real-world workload but was designed purely to stress the write cache, in particular the de-staging from cache to capacity. The workload that created my go-to whiteboard sketch was the 100% sequential 64KB workload with a 1.2TB working set per node for 75 minutes.
The graph clearly shows the benefit of the Optane drives: they keep on chugging at 2,500MB/sec of throughput the entire time without dropping a beat. What’s not to like about that! This is usually when the techie customer in the room will try to burst my bubble by pointing out the unrealistic workload that is in no way reflective of their environment, or most environments… which is true. A more real-world workload would be a simulated relational database workload with a 22KB block size, mixing random 8K and sequential 128K I/O, with 60% reads and 40% writes, and a 600GB per node working set, which is quite a mouthful and is shown in figure 5. The results there show a steady 8.4-8.8% increase in IOPS across the board and a slower rise in latency, resulting in a 10.5% lower response time under 80% load.
Those of you running OLTP workloads will appreciate the graph shown in figure 6, where HammerDB was used to emulate the database activity of a typical online brokerage firm. The Optane cache drives under that workload sustained a remarkable 61% more transactions per minute (TPM) and new orders per minute (NOPM). That can result in significant business improvement for an online brokerage firm that adopts Optane drives versus one that is using NAND SSDs.
When it comes to write cache, performance is not everything, write endurance is also extremely important. The vSAN spec requires that cache drives be SSD Endurance Class C (3,650 TBW) or above, and Intel Optane beats this hands down with an over tenfold margin at 41 PBW (41,984 TBW). The Intel Optane 3D XPoint architecture allows memory cells to be individually addressed in a dense, transistor-less, stackable design. This extremely high write endurance capability has let us spec a smaller sized cache drive, which in turn lets us maintain a similar VxRail node price point, enabling you the customer to get more performance for your dollar.
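The "over tenfold" margin can be reproduced from the two endurance figures quoted above. The sketch below assumes, as the text's own conversion does, that 1 PB is counted as 1,024 TB; the constant names are illustrative.

```python
CLASS_C_MIN_TBW = 3_650   # vSAN SSD Endurance Class C threshold, as quoted in the text
OPTANE_PBW = 41           # Intel Optane rating, as quoted in the text
TB_PER_PB = 1_024         # assumption: binary conversion, matching the quoted 41,984 TBW

optane_tbw = OPTANE_PBW * TB_PER_PB
margin = optane_tbw / CLASS_C_MIN_TBW

# For scale: spread evenly over a five-year life, what does each figure mean per day?
DAYS_IN_FIVE_YEARS = 5 * 365
class_c_tb_per_day = CLASS_C_MIN_TBW / DAYS_IN_FIVE_YEARS
optane_tb_per_day = optane_tbw / DAYS_IN_FIVE_YEARS

print(f"Optane: {optane_tbw:,} TBW, {margin:.1f}x the Class C minimum")
print(f"Per day over five years: {class_c_tb_per_day:.0f} TB vs {optane_tb_per_day:.0f} TB")
```

The margin works out to about 11.5 times the Class C minimum, which corresponds to roughly 23 TB of writes per day against 2 TB per day over five years.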
What’s not to like? Typically, you get to pick any two of faster, better, and cheaper. With Intel Optane drives in your VxRail you get all three: more performance and better endurance, at roughly the same cost. Wins all around!
Author: David Glynn, Sr Principal Engineer, VxRail Tech Marketing
It’s Been a Dell EMC VxRail Filled Summer
Wed, 15 Sep 2021 18:07:34 -0000
Get your VxRail learn on with Tech Field Days and ESG
It has been a busy summer with the launch of our Next Gen VxRail nodes built on 15th Generation PowerEdge servers. This has included working with the fantastic people at ESG and Tech Field Day. Working with these top tier luminaries really forces us to distill our messaging to the key points – no small feat, particularly with so many new releases and enhancements.
If you are not familiar with Tech Field Days, they are “a series of invite-only technical meetings between delegates invited from around the world and sponsoring enterprise IT companies that share their products and ideas through presentations, demos, roundtables, and more”. The delegates are all hand-picked industry thought leaders – those wicked smart people you are following on Twitter – and they ask tough questions. Earlier this month, Dell Technologies spent two days with them: a day dedicated to storage, and a day for VxRail. You can catch the recordings from both days here: Dell Technologies HCI & Storage: Cutting Edge Infrastructure to Drive Your Business.
Of the twelve great, jam-packed VxRail sessions, if you cannot watch them all, do make time in your day for these three:
- VxRail dynamic nodes – Flexibility for the Future. In short, VxRail without vSAN. Leverage HCI Mesh, or FC SAN storage like PowerStore T or PowerMax, and bring that VxRail LCM goodness to other vSphere clusters in your data center.
- What Does “Seamless Technology Integration” Mean for VxRail? I hate tooting my own horn, but if you want a quick deep dive on where all the storage performance in VxRail is coming from, this is your 22-minute Cliff Notes version.
- Get the Most Out of Your K8s with Tanzu on VxRail. This is your crash course about how VxRail can get Tanzu stood up, so you can get Kubernetes at the Speed of Cloud, within your data center.
One more suggestion, if you are new to VxRail, or on the fence about deploying VxRail, tune into this session from Adam Little, Senior Cybersecurity Administrator for New Belgium Brewing, and the reasons they selected VxRail. Even brewing needs high availability, redundancy, and simplicity.
ESG is an IT analyst, research, validation, and strategy firm whose staff is well known for their technical prowess and frankly are fun to work with. Maybe that is because they are techies at heart who love geeking out over new hardware. I got to work with Tony Palmer as he audited the results of our VxRail on 15th Generation PowerEdge performance testing. Tony went through things with a fine-tooth comb, and asked a lot of great (and tough) probing questions.
What was most interesting was how he looked at the same data but in a very different way – quickly zeroing in on how much performance VxRail could deliver at sub-millisecond latency, versus peak performance. Tony pointed out “It’s important to note that not too long ago, performance this high with sub-millisecond response times required a significant investment in specialized storage hardware”. Personally, I love this independent validation. It is one thing for our performance team to benchmark VxRail performance, but it is quite another for an analyst firm to audit our results and to be blown out of the water to the degree they were. Read their full Technical Validation of Dell EMC VxRail on 15th Generation PowerEdge Technology: Pushing the Boundaries of Performance and VM Density for Business- and Mission-critical Workloads, and then follow it up with some of their previous work on VxRail.
If performance concerns have been holding you back from putting your toes in the HCI waters, now is a great time to jump in. The 3rd Gen Intel® Xeon® Scalable processors are faster and have more cores, but also bring other hardware architectural changes. From a storage performance perspective, the most impactful of those is PCIe Gen 4, with double the bandwidth of PCIe Gen 3, which was introduced in 2012.
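For a sense of what "double the bandwidth" means for a cache drive, here is a rough sketch based on the published PCIe signalling rates (8 GT/s per lane for Gen 3, 16 GT/s for Gen 4, both with 128b/130b encoding). The four-lane link width is an assumption about a typical NVMe drive, not a figure from this post, and protocol overhead above the encoding layer is ignored.

```python
def lane_bandwidth_gb_s(transfer_rate_gt_s: float) -> float:
    """Approximate usable GB/s per lane, per direction, with 128b/130b encoding."""
    encoding_efficiency = 128 / 130
    bits_per_byte = 8
    return transfer_rate_gt_s * encoding_efficiency / bits_per_byte


GEN3_GT_S = 8.0    # PCIe 3.0 per-lane signalling rate
GEN4_GT_S = 16.0   # PCIe 4.0 per-lane signalling rate
LANES = 4          # assumption: a typical NVMe SSD uses a x4 link

gen3_x4 = lane_bandwidth_gb_s(GEN3_GT_S) * LANES
gen4_x4 = lane_bandwidth_gb_s(GEN4_GT_S) * LANES

print(f"Gen 3 x4: {gen3_x4:.2f} GB/s, Gen 4 x4: {gen4_x4:.2f} GB/s")
```

That gives roughly 3.94 GB/s for a Gen 3 x4 link and 7.88 GB/s for Gen 4 x4. These are theoretical link ceilings, so what a drive or a full vSAN stack actually delivers will be lower.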
From the OLTP16K (70/30) workload in the following figure, we can see that by just upgrading the vSAN cache drive to PCIe Gen 4, an additional 37% of performance can be unleashed. If that is not enough, enabling RDMA for vSAN nets an additional 21% of performance. One more thing, this is with only two diskgroups… check in with me later for when we crank performance up to 11 with four diskgroups, faster cache drives, and a few more changes.
With OLTP4K (70/30) peak IOPS performance clocking in at 155K with 0.853ms latency per node, VxRail can support workloads that demand the most of storage performance. But performance is not always the focus of storage.
If your workloads benefit from SAN data services such as PowerStore’s 4:1 Data Reduction or PowerMax’s SRDF, then now is a great time to learn about the VxRail Advantage and the benefits that VxRail Lifecycle Management provides. Check out Daniel Chiu’s blog post on VxRail dynamic nodes, where the power of the portfolio is delivering the best of both worlds.