IT Modernization with next-generation Dell PowerEdge Servers and 4th generation Intel® Xeon® Processors
- PowerEdge R760 vSAN 8.0 testing with the Original Storage Architecture (OSA)
- PowerEdge R760 vSAN 8.0 testing with the Express Storage Architecture (ESA)
Thu, 03 Aug 2023 22:50:03 -0000
When transitioning to a new Server Technology, customers must weigh the cost of the solution against the benefits it can provide. A “solution” requires a combination of Hardware, Operating Environment, and Software. To gain maximum benefit from new technologies, it is important to consider all of them when making a decision. One of the biggest challenges this creates is that all three elements rarely emerge simultaneously, and customers can find themselves hindered by past choices.
A real-world example would be a Dell, Intel, and VMware customer planning to upgrade their existing infrastructure.
As the article below notes, vSAN 8.0 with Express Storage Architecture (ESA) represents “A revolutionary release that will deliver performance and efficiency enhancements to meet customers’ business needs of today and tomorrow!” “vSAN ESA will unlock the capabilities of modern hardware by adding optimization for high-performance, NVMe-based TLC flash devices with vSAN, building off vSAN’s Original Storage Architecture (vSAN OSA). vSAN was initially designed to deliver highly performant storage with SATA or SAS devices, the most common storage media at the time. vSAN 8 will give our customers the freedom of choice to decide which of the two existing architectures (vSAN OSA or vSAN ESA) to leverage to best suit their needs.”
The introduction of the next-generation PowerEdge Servers, such as the PowerEdge R760, brings exciting opportunities for customers to enhance their current and future workloads by utilizing the latest vSAN storage architecture. To fully leverage the performance benefits of this new storage architecture, customers can take advantage of the VMware certified hardware configurations for vSAN ESA on Dell vSAN Ready Nodes.
It's important to note that VMware vSAN ESA requires a different set of drives than the OSA hardware. With the release of vSAN 8.0, customers are faced with a decision. They likely have an existing infrastructure based on the vSAN OSA architecture running on vSAN 7.0U3. Now, they need to weigh the advantages and disadvantages of staying with the OSA architecture against upgrading to new hardware to unleash the performance of the new ESA architecture. The ESA architecture serves as an optional, alternative storage architecture for vSAN software and hardware, offering customers a familiar yet upgraded solution. This choice allows customers to tailor their storage architecture to meet their specific needs and preferences.
There are links at the top of this page detailing recent testing by Intel and Dell on the PowerEdge R760 with vSAN. All tests were conducted using VMware’s HCIBench tool, which VMware describes as “an automation wrapper around the popular and proven open-source benchmark tools: Vdbench and FIO that make it easier to automate testing across an HCI cluster.”
All 4th generation Intel® Xeon® testing was conducted in Dell Labs by Engineers from Intel supported by Engineers from Dell. All testing on 1st generation Intel® Xeon® and 2nd generation Intel® Xeon® was conducted in Intel Labs by Engineers from Intel. The two tests were conducted between November 2022 and March 2023. Solidigm provided all NVMe drives used in these tests.
R760 vSAN 8.0 OSA vs. R640 vSAN 7.0U3 OSA
In the first paper, we configured HCIBench for Vdbench. We compared the performance of a 4-node cluster of PowerEdge R760s with 4th generation Intel® Xeon® Platinum Processors using vSAN 8.0 (OSA) to a 4-node cluster of PowerEdge R640s with 1st generation Intel® Xeon® Platinum Processors and a 4-node cluster of PowerEdge R640s with 2nd generation Intel® Xeon® Platinum Processors, with both R640 configurations using vSAN 7.0U3. All configurations used an “all flash” storage configuration built from components certified and available for that server. The 14th Generation Dell servers were configured with 2x10Gb/s networking cards, which were common when those servers were current. The R760 systems are the first generation of Dell Servers with the PCIe bandwidth necessary to support the OCP 3.0 2x100Gb/s Ethernet networking cards used in the test. The Intel network cards chosen for the R760 also support RoCE v2 (RDMA over Converged Ethernet), which was enabled for this test. RoCE v2 was not available in the NICs used in the prior generation servers. The R640 delivers comparable performance to the R740 and was chosen only for hardware availability reasons.
R760 vSAN 8.0 ESA vs. R640 vSAN 7.0U3 OSA
In the second paper, we configured HCIBench for FIO. We compared the performance of a 4-node cluster of PowerEdge R760s with 4th generation Intel® Xeon® Platinum Processors using vSAN 8.0 (ESA) to a 4-node cluster of PowerEdge R640s with 1st generation Intel® Xeon® Platinum Processors and a 4-node cluster of PowerEdge R640s with 2nd generation Intel® Xeon® Platinum Processors, with both R640 configurations using vSAN 7.0U3. The R640 delivers comparable performance to the R740 and was chosen only for hardware availability reasons.
Vdbench and FIO test throughput (reported in IOPS) and storage latency (reported in milliseconds), but the results are not directly comparable. What is comparable are the ratios of performance gain. After conducting the initial testing with Vdbench to create a baseline, the team moved to FIO for the greater control it provides over tuning parameters. While this would affect performance, it would not be expected to affect the ratios because all systems in each test used a consistent approach for that test.
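The idea of comparing ratios rather than absolute numbers can be sketched in a few lines of Python. This is only an illustration: the function name and the input values are hypothetical placeholders, not measurements from either paper.

```python
def gain_ratios(baseline, candidate):
    """Return (throughput gain, latency gain) of candidate over baseline.

    Both values read as "times better": throughput is candidate / baseline,
    while latency is baseline / candidate because lower latency is better.
    """
    throughput_gain = candidate["iops"] / baseline["iops"]
    latency_gain = baseline["latency_ms"] / candidate["latency_ms"]
    return throughput_gain, latency_gain


# Hypothetical placeholder values, NOT results from the Dell/Intel testing.
old_cluster = {"iops": 100_000, "latency_ms": 4.0}
new_cluster = {"iops": 150_000, "latency_ms": 2.5}

throughput_gain, latency_gain = gain_ratios(old_cluster, new_cluster)
print(f"{throughput_gain:.2f}x throughput, {latency_gain:.2f}x lower latency")
# prints "1.50x throughput, 1.60x lower latency"
```

Because each ratio is computed within a single test (same tool, same parameters), the absolute IOPS produced by Vdbench and FIO never need to be compared with each other.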
The 4th generation Intel® Xeon® processors used in these two tests were different. In the first test, the 4th generation Intel® Xeon® Platinum 8458P was used, while in the second test, the 4th generation Intel® Xeon® Platinum 8460Y+ was used. This was due to hardware constraints at the time of the test but is not expected to affect performance dramatically. This observation is offered based on the following key differences:
Test 1 Results
Vdbench Test Parameters: 8 K block size, 70% reads, 100% random.
Figure: Vdbench throughput, measured in I/O operations per second (IOPS)
Figure: Vdbench latency, measured in milliseconds
As these graphs show, vSAN performance in an OSA environment using the new R760 with 4th generation Intel® Xeon® Platinum Processors is up to 1.5x* faster than the two previous generations, with up to 1.6x lower latency*. These performance increases were likely driven by the increase in network performance (100 Gb/s Ethernet vs. 10 Gb/s Ethernet) and by the generational improvements in the processors and the underlying NVMe drives, which benefit from the higher PCIe throughput available in the R760.
Test 2 Results
FIO Test Parameters: 8 K block size, 70% reads, 100% random.
Figure: FIO throughput, measured in I/O operations per second (IOPS)
Figure: FIO latency, measured in milliseconds
These graphs show that vSAN performance in an ESA environment using the new R760 with 4th generation Intel® Xeon® Platinum Processors is over 6x faster* than the two previous generations and delivers up to 4.9x lower latency*. With similar underlying hardware as the previous test, this performance increase is primarily a function of the new ESA architecture running on the latest generation Servers.
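For readers unfamiliar with FIO, the Test 2 workload parameters (8 K block size, 70% reads, 100% random) map onto standard fio options roughly as sketched below. This is not the HCIBench parameter file used in the testing. The job name, target path, queue depth, I/O engine, and runtime are assumptions chosen for illustration; only the block size, read mix, and random pattern come from the text.

```python
import shlex


def fio_args(target, runtime_s=60):
    """Build an fio command line for an 8 K, 70% read, 100% random workload."""
    return [
        "fio",
        "--name=vsan-8k-70r-random",  # hypothetical job name
        f"--filename={target}",       # WARNING: fio overwrites data on the target
        "--rw=randrw",                # random I/O with mixed reads and writes
        "--rwmixread=70",             # 70% reads, 30% writes
        "--bs=8k",                    # 8 K block size
        "--direct=1",                 # bypass the page cache
        "--ioengine=libaio",          # assumption: Linux async I/O engine
        "--iodepth=4",                # assumption: queue depth is not given in the text
        "--time_based",
        f"--runtime={runtime_s}",     # assumption: runtime is not given in the text
    ]


# "/path/to/test-disk" is a placeholder; substitute a scratch device or file.
print(shlex.join(fio_args("/path/to/test-disk")))
```

HCIBench generates and distributes its own workload definitions across the guest VMs, so a command like this is useful mainly for understanding what the three stated parameters control.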
How to move from OSA to ESA
With higher performance and lower latency, the clear choice would be for customers to move to the vSAN 8.0 ESA architecture using the latest Dell PowerEdge Servers with 4th generation Intel® Xeon® Processors. Still, the question is, “How?”.
According to VMware[i], customers have three options:
- Deploy a new cluster and migrate workloads using vMotion and Storage vMotion.
- Convert existing OSA clusters to ESA by evacuating the cluster, upgrading the hardware, and redeploying it as an ESA solution.
- Perform a rolling cluster migration from OSA to a new cluster.
While the steps necessary for each of these options are different, they all use the same key process: “migrate workloads using vMotion and Storage vMotion.”
Option 1 – Pros and Cons
Option 1 involves deploying new servers into a new cluster and, as it grows, migrating existing virtual machines and storage images to the new cluster.
- Requires the fewest steps
- It does not place any existing data at risk since it can be maintained in the existing cluster until it is ready to move.
- Performance and availability of the environment are affected only during the vMotion/Storage vMotion activities.
- This option also provides the additional performance benefits of the new 4th generation Intel® Xeon® Processors and Dell PowerEdge Servers.
- The “Enhanced vMotion Compatibility” (EVC)[ii] feature of ESXi is designed to enable workloads to be live migrated between different generations of processors to ensure uptime for the workload.
- It requires the purchase of new hardware; however, this effect can be minimized by implementing this change as part of existing growth plans.
Option 2 – Pros and Cons
The choice of option 2 involves evacuating the existing cluster, upgrading the hardware (storage and network), and redeploying the existing servers into a new cluster. Once the hardware transition is complete, the final step would be to migrate the previously moved virtual machines and storage images to this new cluster.
- Some budget savings may be obtained due to reduced hardware replacement
- This approach may be suitable if existing hardware is certified for ESA[iii]. Details on ESA hardware requirements can be found at the link in this document’s end notes.
- This approach requires that all nodes be reconfigured with NVMe drives. If the current environment uses spinning disks with SSDs as the cache layer, purchasing new drives and reprovisioning the hardware can be expensive and can require many hours of work. Note that even existing clusters with all-NVMe configurations would be using older drives that cannot deliver the same performance as the latest generation of NVMe. Depending on the choices made when the original hardware was purchased, this option may not exist. For example, this option is not available if the existing systems do not have the space and connections necessary to host the required number of NVMe drives.
- This option also adds time to the process because vMotion/Storage vMotion must be used twice: first to vacate the cluster and then to repopulate it.
- This option requires that sufficient capacity is available in other clusters to accommodate 100% of the capacity of the cluster being redeployed.
- This approach may require distributing virtual machines and storage images to multiple clusters to obtain the capacity needed. In this case, it adds additional complexity to the migration as the human resources who manage the environment will need to determine how to rebalance all these environments.
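The capacity requirement for Option 2 can be checked before any data moves. The sketch below is a simplified planning aid, not a VMware tool: the function name, cluster names, and capacities are hypothetical, and it ignores storage policies, slack space, and compute headroom.

```python
def plan_evacuation(source_used_tb, free_tb_by_cluster):
    """Greedily spread source_used_tb across clusters with free capacity.

    Returns {cluster: TB to move there}, or None when the other clusters
    cannot absorb 100% of the source cluster's used capacity.
    """
    if sum(free_tb_by_cluster.values()) < source_used_tb:
        return None
    plan = {}
    remaining = source_used_tb
    # Fill the cluster with the most free space first to touch fewer clusters.
    for name, free in sorted(free_tb_by_cluster.items(), key=lambda kv: -kv[1]):
        if remaining <= 0:
            break
        moved = min(free, remaining)
        plan[name] = moved
        remaining -= moved
    return plan


# Hypothetical capacities in TB.
print(plan_evacuation(60, {"cluster-b": 40, "cluster-c": 35}))
# prints "{'cluster-b': 40, 'cluster-c': 20}"
print(plan_evacuation(60, {"cluster-b": 20, "cluster-c": 30}))
# prints "None"
```

When the plan spans more than one cluster, as in the first call, the rebalancing effort described in the last bullet applies: someone has to decide which virtual machines go where and move them back afterward.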
Option 3 – Pros and Cons
Option 3 involves selectively removing servers from the existing cluster, allowing time for the vSAN environment to rebuild, shutting down the selected servers, upgrading the hardware (storage and network), and redeploying the existing servers into a new cluster. As this new cluster grows, the final stage would be migrating existing virtual machines and storage images to this new cluster.
- Some budget savings may be obtained due to reduced hardware replacement
- This approach may be suitable if existing hardware is certified for ESA[v]. Details on ESA hardware requirements can be found at the link in this document’s end notes.
- As with Option 2, this approach requires that all nodes be reconfigured with NVMe drives. If the current environment uses spinning disks with SSDs as the cache layer, purchasing new drives and reprovisioning the hardware can be expensive and can require many hours of work. Note that even existing clusters with all-NVMe configurations would be using older drives that cannot deliver the same performance as the latest generation of NVMe. Depending on the choices made when the original hardware was purchased, this option may not exist. For example, this option is unavailable if the existing systems do not have the space and connections necessary to host the required NVMe drives.
- This option requires less time in each step of the transition but may require more time overall. This approach also requires appropriate planning to allow the old vSAN cluster time to redistribute the data after each server is removed.
- This approach also introduces additional risk due to the high level of coordination required between resources to ensure that the correct server is removed from the cluster.
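The planning burden of Option 3 comes from the fact that each removal shrinks the space available for the next rebuild. The following sketch simulates that effect under loudly simplified assumptions: every host has the same capacity, a rebuild spreads the removed host's data evenly over the remaining hosts, and the cluster must keep at least `min_hosts` members (3 is used here as an assumed minimum). The function and host names are hypothetical, and real vSAN behavior depends on storage policies that are not modeled.

```python
def rolling_removal_order(used_tb, capacity_tb_per_host, to_remove, min_hosts=3):
    """Simulate removing hosts one at a time, with a rebuild after each step.

    Returns the hosts that can be removed, in order, stopping at the first
    host whose removal would leave too few hosts or too little free space.
    """
    used = dict(used_tb)
    removed = []
    for host in to_remove:
        others = {h: u for h, u in used.items() if h != host}
        free = sum(capacity_tb_per_host - u for u in others.values())
        if len(others) < min_hosts or used[host] > free:
            break
        # Simplification: the rebuild spreads the data evenly over the others.
        share = used[host] / len(others)
        used = {h: u + share for h, u in others.items()}
        removed.append(host)
    return removed


# Hypothetical cluster: five hosts, 20 TB each, 10 TB used per host.
cluster = {f"esx{i}": 10.0 for i in range(1, 6)}
print(rolling_removal_order(cluster, 20.0, ["esx1", "esx2", "esx3"]))
# prints "['esx1', 'esx2']"
```

In this example the third removal is blocked by the minimum host count rather than by capacity, which illustrates why the old cluster usually cannot be drained completely without first moving workloads to the new one.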
IT professionals’ primary responsibilities are reducing downtime, increasing performance and scalability, and optimizing infrastructure. As technology continues to evolve, engineers at Dell, Intel, and VMware are focused on optimizing new solutions to deliver greater value to customers. Deploying new technologies into old environments reduces or sometimes eliminates this value. Combining Dell PowerEdge Servers with 4th generation Intel® Xeon® Processors and the latest VMware hypervisor/vSAN software can dramatically improve performance, reduce latency, and significantly increase the business benefit. With storage devices forming a large portion of the cost of a server, reconfiguring existing hardware to take advantage of vSAN 8.0 ESA requires a significant capital investment, yet it will still not deliver maximum performance because legacy NVMe drives and servers are slower. In addition, this approach significantly increases the workload on existing IT staff. Based on this, Dell and Intel recommend that customers implement Option 1 to modernize their IT infrastructure, reduce risk, and maximize business benefits.
*All performance claims noted in this document were based on measurements conducted in accordance with published standards for HCIBench. Performance varies by use, configuration, and other factors. Performance results are based on testing conducted between November 2022 and March 2023.
Test Report: PowerEdge R760 with Elasticsearch
Wed, 02 Aug 2023 17:04:20 -0000
The introduction of new server technologies allows customers to use the new functionality to deploy solutions. It can also provide an opportunity for them to review their current infrastructure to see whether the new technology can increase efficiency. With this in mind, Dell Technologies recently conducted performance testing of an Elasticsearch solution on the new Dell PowerEdge R760 and compared the results to the same solution running on the previous generation R750 to determine whether customers could benefit from a transition. All testing was conducted in Dell Labs by Intel and Dell engineers in April 2023.
Choosing which CPU to deploy with an advanced solution like Elasticsearch can be challenging. A customer looking for maximum performance would typically start with the most expensive CPU available, while another customer might make a choice that offers a tradeoff between performance and price. For the purposes of this test, we decided to benchmark the new R760 with a lower cost processor so that we could compare the results to a previous generation R750 server using the top end Intel® Xeon® Platinum 8380 CPU.
An Elasticsearch solution includes multiple key components that combine into the “Elastic Stack”.
- Elasticsearch: RESTful, JSON-based search engine
- Logstash: Log ingestion pipeline
- Kibana: Flexible visualization tool
- Beats: Lightweight, single purpose data shippers
To conduct the testing, we deployed Rally 2.7.1 as the benchmarking tool. Using an OpenShift Kubernetes cluster, each server was configured to create an Elasticsearch cluster with eight instances (containers). Next, each system ran 10 cycles of searches to establish a “steady-state” flow of data as an indexing test. The performance of each system was measured by capturing the mean throughput of the bulk index (doc/s) and the search query latency (ms).
The benchmark simulated storing log files (application, http_logs, and system logs) and users who use Kibana to run analytics on this data. The test executes indexing and querying concurrently. Data replication was enabled, and software configuration was the same on both platforms.
The average CPU utilization during the test was 80%.
Logging - server log data
The logging-indexing-querying workload generates multiple server logs before the test. The benchmark executes indexing and querying concurrently. Queries were issued until indexing was complete.
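The "query until indexing is complete" pattern can be illustrated with a small, self-contained Python sketch. It does not talk to Elasticsearch or Rally: an in-memory list stands in for the index, and the function and variable names are hypothetical.

```python
import threading


def run_benchmark(num_docs):
    """Index num_docs documents while a second thread queries concurrently."""
    indexed = []                          # stand-in for the Elasticsearch index
    indexing_done = threading.Event()
    query_count = 0

    def indexer():
        for i in range(num_docs):
            indexed.append({"id": i, "message": f"log line {i}"})
        indexing_done.set()               # signal the querier to stop

    def querier():
        nonlocal query_count
        # Keep issuing queries until the indexer reports completion.
        while not indexing_done.is_set():
            _ = len(indexed)              # stand-in for a search request
            query_count += 1

    threads = [threading.Thread(target=indexer), threading.Thread(target=querier)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(indexed), query_count


docs, queries = run_benchmark(10_000)
print(f"indexed {docs} documents while serving {queries} queries")
```

The number of queries served varies from run to run, which is one reason a benchmark of this kind reports throughput and latency statistics rather than raw counts.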
We used the following log types:
- Nginx access and error logs
- Apache access and error logs
- MySQL slowlog and error logs
- Kafka logs
- Redis app logs
- System syslog logs
- System auth logs
Who uses it? This data is typically produced by web services and could be used to validate HTTP responses, track web traffic, and monitor databases and system logs.
Hardware configurations tested
Note: The Dell Ent NVMe P5600 MU U.2 3.2TB Drives are manufactured by Solidigm.
Recommended customer pricing for the CPUs used in the tested configurations
- R750 - Intel Xeon Platinum 8380 - $9,359 - reviewed on June 6, 2023
- R760 - Intel Xeon Platinum 8460Y+ - $5,558 – reviewed on June 6, 2023
The following results represent the mean of 10 separate test runs.
Indexing Throughput (docs/s)
Indexing throughput indicates how many documents (log lines) Elasticsearch can index per second.
Note: Higher is better
Latency improvement indicates how much faster search query results return.
Note: Higher is better
Power consumption and calculations
Choosing the right combination of server and processor can increase performance, reduce latency, and reduce cost. As this testing demonstrated, the Dell PowerEdge R760 with 4th Generation Intel Xeon Platinum 8460Y+ CPUs was up to 1.24x faster than the Dell PowerEdge R750 with 3rd Generation Intel Xeon Platinum 8380 CPUs.
An important element to consider is that the R760 was able to accomplish all of this using CPUs with a recommended customer price that was more than 40% less, thus reducing capital expense. The testing further demonstrated that customers can reduce operating costs by implementing new technologies that can deliver more work per watt.
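The "more than 40% less" figure follows directly from the two recommended customer prices listed earlier in this report. A short check, using only those two prices:

```python
r750_cpu_price = 9359  # Intel Xeon Platinum 8380, USD per CPU
r760_cpu_price = 5558  # Intel Xeon Platinum 8460Y+, USD per CPU

saving_usd = r750_cpu_price - r760_cpu_price
saving_pct = saving_usd / r750_cpu_price

print(f"${saving_usd:,} less per CPU ({saving_pct:.1%} lower)")
# prints "$3,801 less per CPU (40.6% lower)"
```

This covers CPU list price only. It does not account for the rest of the server configuration or for negotiated pricing.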
Powering your Elasticsearch Solution with Dell PowerEdge Servers and Intel® 4th Generation Xeon® Processors
Wed, 02 Aug 2023 16:49:52 -0000
This joint paper briefly discusses the key hardware considerations for a successful deployment and recommends configurations based on Dell 16th Generation PowerEdge servers.
Elasticsearch is a distributed, open-source search and analytics engine for all types of data including textual, numerical, geospatial, structured, and unstructured. This proposal contains recommended configurations for Elasticsearch clusters on the Kubernetes platform (Red Hat OpenShift Container Platform with Elastic Cloud on Kubernetes (ECK) operator) running on 16th Generation Dell PowerEdge servers with 4th Generation Intel Xeon Scalable processors.
- Faster and scalable performance - Elasticsearch running on the latest Dell PowerEdge servers is built on high-performing Intel architecture and configured with 4th Generation Intel Xeon Scalable processors. Indexing is faster and capacity can scale with your needs.
- Better energy and data center space efficiency - Running Elasticsearch on the latest generation of PowerEdge servers can save energy and power an even more effective search experience. Moving to the latest generation of PowerEdge servers based on Intel Xeon can help reduce emissions, protect our environment, and reduce operating costs.
- Reduced search times and increased number of concurrent searches - As data grows and needs to be accessed across the cluster, data-access response times are critical, especially for real-time analytics applications. Elasticsearch running on the latest PowerEdge servers is built on high-performing Intel architecture, including Intel Ethernet network controllers, adapters, and accessories to enable agility in the data center and support higher throughput with low latency response times.
- Index more data - Elasticsearch can handle and store more data by increasing DRAM capacity and using PCIe Gen 4 NVMe disk drives. PowerEdge R760 servers are ideally suited to this requirement with memory capacity of up to 8TB and storage expansion of up to 24 high performance NVMe drives.
- Easy and secure installation - The Elastic Cloud on Kubernetes (ECK) operator is an official Elasticsearch operator certified on the Red Hat OpenShift Container Platform, providing ease of deployment, management and operation of Elasticsearch, Kibana, APM Server, Beats, and Enterprise Search on OpenShift clusters. Elasticsearch clusters are secure by default (with enabled encryption and strong passwords).
- Multi Data Tiers - As data grows, costs do not also need to increase. With multiple tiers of data, you can extend capacity and drive storage costs down without performance loss. Each capacity layer can be scaled independently by using larger drives or more nodes (or both), depending on your needs.
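As a rough illustration of how data tiers work in practice, Elasticsearch steers an index toward a tier through the `index.routing.allocation.include._tier_preference` index setting, which takes a comma-separated list of tiers in order of preference. The sketch below builds that setting from the age of the data. The age thresholds and the function name are assumptions made for this example, not recommendations from this paper, and in a real deployment index lifecycle management would normally handle the transitions.

```python
TIER_ORDER = ["data_hot", "data_warm", "data_cold"]  # fastest to cheapest


def tier_preference(age_days, warm_after_days=7, cold_after_days=30):
    """Return index settings that prefer a tier appropriate to the data's age.

    The thresholds are hypothetical defaults chosen only for illustration.
    """
    if age_days >= cold_after_days:
        target = 2
    elif age_days >= warm_after_days:
        target = 1
    else:
        target = 0
    # Prefer the target tier, then fall back to progressively hotter tiers.
    preferred = list(reversed(TIER_ORDER[: target + 1]))
    return {"index.routing.allocation.include._tier_preference": ",".join(preferred)}


print(tier_preference(10))
# prints "{'index.routing.allocation.include._tier_preference': 'data_warm,data_hot'}"
```

Listing fallback tiers means an index can still be allocated if the preferred tier has no nodes, which keeps a small cluster working before warm or cold nodes are added.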
Elasticsearch cluster on Kubernetes (Red Hat OpenShift Kubernetes) platform
Common to both node types:
- Platform: Dell PowerEdge R760 chassis with up to 24x2.5” NVMe Direct Drives
- Boot device: Dell BOSS-S2 with 2x 240GB or 2x 480GB M.2 SATA SSD (RAID1)
- Storage controller: Not needed for all-NVMe configurations
- Network: Intel E810-CQDA2 for OCP3 (dual-port 100GbE)
OpenShift Control Plane Master Nodes:
- CPU: 2 x Intel Xeon Gold 6430 processors
- Memory: 128GB (16x 8GB DDR5-4400)
- Storage: 1x 1.6TB Enterprise NVMe Mixed-Use AG Drive U.2 Gen4
Elasticsearch Master / Ingest / Hot tier data nodes:
- CPU: 2 x Intel Xeon Platinum 8460Y+ processors
- Memory: 512GB (16x 32GB DDR5-4800)
- Storage: 2x (up to 24x) 3.2TB Enterprise NVMe Mixed-Use AG Drive U.2 Gen4
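The data-node storage line above (2x, up to 24x, 3.2TB NVMe drives) translates into raw capacity per node as sketched below. The cluster-level estimate is a deliberately rough assumption: the 500 TB data volume and the single replica are hypothetical inputs, and the calculation ignores file system overhead, index overhead, and free-space headroom.

```python
import math

DRIVE_TB = 3.2  # capacity of each NVMe drive in the data-node configuration


def raw_capacity_tb(drives_per_node):
    """Raw NVMe capacity of one data node, in TB."""
    return drives_per_node * DRIVE_TB


def nodes_needed(data_tb, replicas, drives_per_node):
    """Minimum node count to hold data_tb plus its replica copies (raw only)."""
    total_tb = data_tb * (1 + replicas)
    return math.ceil(total_tb / raw_capacity_tb(drives_per_node))


print(f"{raw_capacity_tb(2):.1f} TB to {raw_capacity_tb(24):.1f} TB raw per node")
# prints "6.4 TB to 76.8 TB raw per node"

# Hypothetical workload: 500 TB of primary data with one replica copy.
print(nodes_needed(500, replicas=1, drives_per_node=24))
# prints "14"
```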
Contact your Dell account team for a customized quote 1-877-289-3355.
Read the doc: What is Elasticsearch?
Read the doc: Data tiers | Elasticsearch Guide