![Banner image](https://cdn-prod.scdn6.secure.raxcdn.com/static/media/f36f9de5-dd3a-4de1-af99-9f39e27016e8.jpg?_cb=1696260304.895245)
The case for upgrading your servers to Dell PowerEdge R7625 servers powered by 4th Gen AMD EPYC processors
Mon, 25 Sep 2023 15:51:00 -0000
Principled Technologies examined the performance improvements and cost savings associated with upgrading to the 16th Generation Dell PowerEdge R7625 for machine learning algorithms
Overview
Recent years have seen a dramatic increase in the amount of data organizations store and analyze. Between 2010 and 2020, the amount of data people and organizations created, copied, consumed, and stored increased from 2 zettabytes to 64 zettabytes.[i] Machine learning (ML) tools can help companies put this data to work by analyzing it and extracting key insights, enabling more informed, data-driven business decisions. To meet this need, ML tools have become more powerful—but these workloads also put more demand on data centers.
We used the HiBench benchmark to understand the benefits of upgrading from the 15G Dell™ PowerEdge™ R7525 server to the 16G Dell PowerEdge R7625 server powered by Broadcom® network interface cards (NICs) and PERC 11 storage controllers. Both servers feature two AMD EPYC™ 64-core processors for a direct core-to-core generational comparison. We measured the throughput and time to complete k-means clustering and Bayesian classification workloads using both servers. We found the latest-generation PowerEdge R7625 offered better performance for the same number of cores running both workloads. This means that organizations that upgrade to the latest-generation PowerEdge R7625 servers could process ML workloads faster, allowing them to update their models with new data more frequently for more timely insights. Plus, organizations that choose PowerEdge R7625 servers could save money by reducing the number of servers required to do the same amount of work as PowerEdge R7525 servers, which could reduce energy and cooling costs as well as licensing costs: up to $10,178.99 per year per consolidated server on Red Hat OpenShift licensing.
The challenges of data proliferation and compute‑intensive workloads
The rise of the Internet of Things (IoT), cloud computing, and smartphones has made it possible for businesses to harvest data from a wide range of sources and use it to improve their operations. Retailers can use data to track customer behavior and make their marketing more effective; manufacturers can use data to make their processes more efficient; and financial institutions can use data to detect fraud or predict market changes. As businesses gain access to new sources of data and use new technologies to analyze that data, the demand for more powerful servers will continue to grow.
Machine learning and artificial intelligence (AI) workloads have enormous potential to improve business operations, but as they gain popularity, they consume increasing amounts of processing power.[ii] According to OpenAI, developers of ChatGPT, the computing power of their AI system doubles every 3.4 months.[iii] As the ML applications organizations use become more demanding, they will need more powerful servers in their data centers as well as efficient data analysis tools in the ML pipeline. Among those data analysis tools is Apache Spark™.
Apache Spark is an open-source computing framework that converts very large data sets into smaller blocks of data for the purpose of applying machine learning algorithms and analyzing the data quickly using a distributed network of devices. For algorithms that operate on chunks of data, Spark is effective because it farms the data out to servers in the cluster, the servers process the chunks of data, and then Spark combines them for the final result. One of the main advantages of using Spark is that it can split data sets into chunks that fit in memory (when the entire data set might not) and operate with data that is entirely in memory—it doesn’t need to write to disk, which saves time. Spark is scalable: users can expand the size of their data set by adding more nodes. According to Databricks®, Spark can process “multiple petabytes of data on clusters of over 8,000 nodes,” and Spark supports a variety of data sources, including Hadoop HDFS. [iv]
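The split, process, and combine pattern described above can be illustrated without Spark itself. The following is a minimal sketch in plain Python, not Spark code: the function names are hypothetical, and a local thread pool stands in for the servers in a cluster. It computes a mean by having each chunk produce a partial sum and count, then combining the partial results.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def split_into_chunks(values: List[float], chunk_size: int) -> List[List[float]]:
    """Split a data set into fixed-size chunks (stand-ins for Spark partitions)."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def partial_sum_and_count(chunk: List[float]) -> Tuple[float, int]:
    """Work done independently on one chunk: a partial sum and a row count."""
    return sum(chunk), len(chunk)


def distributed_mean(values: List[float], chunk_size: int, workers: int = 4) -> float:
    """Process the chunks in parallel, then combine the partial results."""
    chunks = split_into_chunks(values, chunk_size)
    # The thread pool is only a local stand-in for worker nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_and_count, chunks))
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total / count


if __name__ == "__main__":
    data = [float(x) for x in range(1, 101)]
    print(distributed_mean(data, chunk_size=10))
```

Each chunk needs only its own rows in memory, which is the property that lets a framework such as Spark work on data sets larger than any single node's RAM.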
We focused on two Apache Spark capabilities—k-means clustering and Bayesian classification—in our examination of the value of upgrading to the 16G Dell EMC PowerEdge R7625 server powered by 4th Gen AMD EPYC processors along with Broadcom NICs and PERC 11 storage controllers. Using these workloads, we measured the throughput and speed of the servers. A server with better throughput and speed can process more data, handle more concurrent users, handle heavier workloads, and improve response times.
About Dell EMC PowerEdge R7625 servers
The Dell EMC PowerEdge R7625 server we tested features two AMD EPYC™ 9554 processors that each contain 64 cores and a Broadcom BCM5720 NIC. According to Dell, “the PowerEdge R7625 is a highly scalable two-socket, 2U rack server packed with 50 percent more cores and up to 6 GPUs in a package that combines powerful performance and flexible configuration.”[v] According to Dell, the R7625 features:
- “Up to two 4th Gen AMD EPYC processors with up to 96 cores
- Available with either liquid or air-cooled configurations
- Low-latency storage options”[vi]
How we tested
We tested the following configurations:
- One 16G Dell PowerEdge R7625 server powered by 4th Gen AMD EPYC 64-core processors along with Broadcom NICs and PERC 11 storage controllers
- One 15G Dell PowerEdge R7525 server powered by 3rd Gen AMD EPYC 64-core processors along with Broadcom NICs and PERC 10 storage controllers
We configured both systems at maximum RDIMM capacity. The R7625 has a higher maximum capacity (3TB) and faster RAM (4,800 MT/s) than the R7525 (2TB and 3,200 MT/s), which is a useful upgrade for processing memory-intensive Spark workloads. We used Red Hat® OpenShift® virtualization. OpenShift is an open-source, Kubernetes-based container platform that offers a set of tools to manage, scale, and deploy containerized applications. For our deployment of OpenShift, we used single-node deployment mode, a new feature meant for proof-of-concept environments. A typical OpenShift deployment uses three or more servers in a clustered configuration.
On each system, we created 10 OpenShift VMs with 24 cores, 96GB RAM, and one OpenShift VM with 12 cores, 32GB RAM, and one 30GB storage volume. We used this network for Spark cluster communications and Spark testing. We used Red Hat Enterprise Linux® 8 for the OS and installed Java™ 1.8.0, Python2®, and Apache Maven® 3.5.4; Apache Spark 3.0.3 with the Apache Hadoop 3.2 libraries; Apache Hadoop 3.2.4 for its HDFS capabilities; and the HiBench testing framework, version 7.1.1 with updates up to June 12, 2023 from its GitHub repository. We configured the 12-core VM as the Spark primary, and as the Hadoop manager for HDFS. We configured the remaining 10 VMs as Spark workers and Hadoop data nodes for HDFS. We used the storage volume for both the OS and for HDFS. We ran HiBench Bayes and k-means workloads from the Spark primary VM. Below is a table showing a summary of the system configurations we used in testing. For more details about our testing and configurations, read the science behind the report.
Table 1: System configurations we used in testing. Source: Principled Technologies.
Server configuration information | Dell PowerEdge R7625 | Dell PowerEdge R7525 |
Hardware | | |
Processor | AMD EPYC 9554 – 64 cores, 3.10 GHz | AMD EPYC 7763 – 64 cores, 2.45 GHz |
Storage controller | PERC H755 Front, 8GB cache | PERC H745 Front, 4GB cache |
Total memory in system (GB) | 3,072 | 2,048 |
Disks | 4x Dell Ent NVMe v2 AGN MU U.2 6.4TB, 6,144GB, NVMe v2, PCIe, SSD | 4x Dell Ent NVMe v2 AGN MU U.2 6.4TB, 6,144GB, NVMe v2, PCIe, SSD |
Software | | |
VM software | Spark 3.0.3; Hadoop 3.2.4; OpenJDK 1.8.0_372 | |
Operating system name and version | Red Hat Enterprise Linux CoreOS 4.12; Linux kernel 4.18.0-372.49.1.el8_6.x86_64 | |
Virtualization | OpenShift Virtualization 4.12 | |
VM operating system name and version | Red Hat Enterprise Linux 8.8; Linux kernel 4.18.0-477.13.1.el8_8.x86_64 | |
About 4th Gen AMD EPYC 9554 processors
According to AMD, EPYC 9554 processors deliver fast performance “for cloud, enterprise, and HPC workloads- helping accelerate your business.”[vii] EPYC processors include AMD Infinity Guard, which per AMD is “a set of layered, cutting-edge security features that help you protect sensitive data and avoid the costly downtime cause by security breaches.”[viii]
The EPYC 9554 has support for AVX512 processor extensions that speed up AI inference, including the use of the BFloat 16 data type (AVX512_BF16), and Vector Neural Network Instructions (AVX512_VNNI). In contrast, the EPYC 7763 processor has no support for AVX512 instructions.
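One way to confirm whether a given Linux host exposes these extensions is to look at the feature flags in `/proc/cpuinfo`, where the kernel reports them as `avx512_bf16` and `avx512_vnni`. The sketch below is illustrative only; the helper names are hypothetical, and it assumes an x86 Linux system.

```python
import os
from typing import Dict, Set

# Flag names as the Linux kernel reports them in /proc/cpuinfo.
AI_FLAGS = ("avx512_bf16", "avx512_vnni")


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Collect the CPU feature flags from /proc/cpuinfo-style text."""
    flags: Set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def ai_extension_support(cpuinfo_text: str) -> Dict[str, bool]:
    """Report, per extension, whether the flag is present."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in AI_FLAGS}


if __name__ == "__main__":
    # Only attempt the read where the file exists (Linux).
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo", encoding="utf-8") as handle:
            print(ai_extension_support(handle.read()))
```

Based on the description above, both entries would be expected to be true on the EPYC 9554 and false on the EPYC 7763.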
In addition to performance and security features, AMD claims their processors are energy-efficient, which can reduce energy costs and “minimize environmental impacts from data center operations while advancing your company’s sustainability objectives.”[ix]
For more information about 4th Gen AMD EPYC processors visit: https://www.amd.com/en/processors/epyc-server-cpu-family.
About the HiBench benchmark suite
According to its GitHub repository, the HiBench benchmark suite “is a big data benchmark suite that helps evaluate different big data frameworks in terms of speed, throughput and system resource utilizations.”[x] The HiBench benchmark suite offers performance testing for 29 different types of workloads, including the machine learning algorithms associated with Bayesian Classification (Bayes) and k-means clustering.
Our results
K-means clustering
For large data sets, a human cannot analyze the data as efficiently or effectively as a machine learning algorithm can. K-means clustering is a machine learning algorithm that groups similar data points together in clusters, separating them from dissimilar ones. By finding similarities between data points that wouldn’t be obvious with other means of analysis, k-means clustering can unlock valuable insights into individual data points, whether they are about the customers of a business, the manufacturing processes of a factory, or some other aspect of a business. These insights could help an e-commerce company offer promotions to similar types of customers or help an insurance company detect anomalies or fraud. Using the latest generation of server technology has the potential to help businesses unlock these actionable data insights faster. Tools like RapidMiner®, ELKI, Orange, Weka®, and MATLAB™ rely on k-means clustering for some types of calculations.
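To make the idea concrete, here is a minimal sketch of the standard k-means iteration (often called Lloyd's algorithm) in plain Python. It is not the Spark implementation that HiBench exercises; the function names and the choice to pass initial centroids explicitly are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def mean_point(points: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a non-empty group of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def k_means(points: Sequence[Point], initial: Sequence[Point],
            max_iter: int = 100) -> Tuple[List[Point], List[int]]:
    """Alternate assignment and centroid updates until the labels stop changing."""
    centroids = list(initial)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = mean_point(members)
    return centroids, labels


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
    print(k_means(data, initial=[(1.0, 1.0), (8.0, 8.0)]))
```

Every iteration touches every data point, which is why the workload is compute-intensive on large data sets and why it parallelizes well across chunks of data.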
To better understand how upgrading server technology might benefit organizations that use k-means clustering to analyze their data, we used the HiBench benchmark suite to compare the k-means performance in terms of throughput (megabytes per second) and speed (seconds). As Figures 1 and 2 show, the new Dell PowerEdge R7625 server outperformed the previous-generation server in both measurements. The latest-generation server had 70.0 percent higher throughput and completed the k-means workload 41.2 percent faster than the previous-generation device.
These results suggest that organizations that frequently use k-means clustering to gain insights might benefit from upgrading their older servers. For an e-commerce company that provides personalized product recommendations to millions of users based on data, better throughput and faster k-means speed could allow them to tailor their recommendations more quickly. Faster throughput and speed could allow the e-commerce company to update their clustering model more frequently so that it adapts to changing customer behavior in real time. These improvements could lead to more customer engagement and higher sales.
Figure 1: A comparison of the k-means throughput of the two servers in megabytes per second. Higher is better. Source: Principled Technologies.
Figure 2: A comparison of the times, in seconds, that the two servers took to complete the test k-means workload. Lower is better. Source: Principled Technologies.
Bayesian classification
Bayesian classification (or Bayesian inference) is a method of estimating the probability of an outcome and calculating the uncertainty around this probability using historical data. By analyzing prior outcomes, Bayesian machine learning can give organizations a statistical probability for a future outcome. A retailer may want to know the probability of a customer making a purchase after receiving a coupon code, for example. More advanced applications of Bayesian inference have helped scientists develop new drugs and assign probability to the accuracy of diagnostic tests.[xi],[xii] Being able to quickly analyze data sets for predictions about the future can be a powerful tool for businesses and organizations.
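The coupon example can be worked through with Bayes' rule. The sketch below is illustrative: the function name is hypothetical and the probabilities are invented numbers, not data from this report.

```python
def posterior(prior: float, likelihood: float, likelihood_if_not: float) -> float:
    """P(H | E) from Bayes' rule.

    prior             = P(H)
    likelihood        = P(E | H)
    likelihood_if_not = P(E | not H)
    """
    # Total probability of seeing the evidence, with or without H.
    evidence = likelihood * prior + likelihood_if_not * (1.0 - prior)
    return likelihood * prior / evidence


if __name__ == "__main__":
    # Hypothetical inputs: 10% of customers purchase; 60% of purchasers had
    # received a coupon; 20% of non-purchasers had received a coupon.
    p = posterior(prior=0.10, likelihood=0.60, likelihood_if_not=0.20)
    print(f"P(purchase | coupon) = {p:.2f}")
```

With these made-up inputs, receiving a coupon raises the estimated purchase probability from 10 percent to 25 percent. A classifier applies the same rule across many features and many records, which is where the compute cost comes from.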
To evaluate the Bayesian analysis performance of the servers, we used the HiBench benchmark suite to compare the total throughput, measured in megabytes per second, and the speed of analysis, in seconds. As Figure 3 shows, the 16G Dell PowerEdge R7625 achieved 19.5 percent more throughput than the previous-generation server. As Figure 4 shows, the new server was 16.3 percent faster at completing the Bayesian classification workload than the previous-generation server we compared it to.
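The throughput and completion-time figures in this report are two views of the same measurement. Assuming throughput is the fixed input size divided by completion time (an assumption about how the benchmark reports it), a throughput gain of g percent corresponds to a time reduction of 1 − 1/(1 + g/100). The short sketch below, with a hypothetical function name, checks the reported pairs.

```python
def time_reduction_from_throughput_gain(gain_percent: float) -> float:
    """Percent less time for a fixed amount of work when throughput rises by gain_percent."""
    speedup = 1.0 + gain_percent / 100.0
    return (1.0 - 1.0 / speedup) * 100.0


if __name__ == "__main__":
    for workload, gain in (("k-means", 70.0), ("Bayes", 19.5)):
        reduction = time_reduction_from_throughput_gain(gain)
        print(f"{workload}: +{gain}% throughput -> {reduction:.1f}% less time")
```

The k-means gain of 70.0 percent maps to 41.2 percent less time and the Bayes gain of 19.5 percent maps to 16.3 percent less time, matching the reported figures.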
These results indicate how much organizations that use Bayesian machine learning to make probabilistic calculations might benefit from upgrading their aging servers. For a financial services company that uses Bayesian analysis to make investment decisions and assess risk, higher throughput and speed could allow them to handle larger data sets and run more complex models to make more accurate, real-time decisions. Alternatively, a healthcare system that uses Bayesian models for diagnosis and treatment could update patient models faster and more frequently, leading to more accurate diagnoses and better health outcomes for patients.
Figure 3: A comparison of the Bayes throughput of the two servers in megabytes per second. Higher is better. Source: Principled Technologies.
Figure 4: A comparison of the times, in seconds, that the two servers took to complete the test Bayes workload. Lower is better. Source: Principled Technologies.
Performance and value – How these results can impact the bottom line
With any decision to upgrade a server environment, companies want to know that their upfront investment in new technology provides opportunities to save money further down the road. New technologies come at a price, but improvements in performance and efficiency can pay off in the long run.
Organizations can potentially save money by consolidating older servers with higher-performing, newer servers that each do more work. In our testing, a single Dell PowerEdge R7625 outperformed the Dell PowerEdge R7525 by up to 70 percent, completing 1.7 times as much k-means work as a single PowerEdge R7525. This means that two PowerEdge R7625 servers could process 3.4 times as much k-means work as one PowerEdge R7525 server. In other words, two PowerEdge R7625 servers can process the same amount of work as three PowerEdge R7525 servers, with spare capacity equal to 40 percent of one PowerEdge R7525’s workload. Thus, an organization that upgrades the servers in their data centers could likely reduce the total number of servers and still process the same workloads.
For each server a company can consolidate onto new gear, they can reduce their Red Hat OpenShift Platform Plus licensing costs by $10,178.99 for a standard 1-year subscription or by $27,820.99 for a standard 3-year subscription.[xiii],[xiv] These savings don’t take into account premium subscriptions or additional support add-ons, which would make the reduction in annual licensing and support costs larger still. By reducing server counts, companies could also find savings in the reduction of cooling costs, power costs, and data center footprints. As the number of servers in a data center scales, so too do the savings associated with upgrading to the latest-generation PowerEdge R7625 servers.
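The consolidation arithmetic can be sketched as follows. This is a rough estimate under stated assumptions, not a sizing tool: it assumes the 1.7x k-means ratio holds for the whole workload, that one subscription covers one server, and that the cited 1-year price applies. The function names are hypothetical.

```python
import math
from decimal import Decimal
from fractions import Fraction

# From the report: one R7625 completed 1.7 times the k-means work of one R7525.
WORK_RATIO = Fraction(17, 10)
# Cited price of a 1-year standard subscription (1-2 sockets), per server.
LICENSE_1_YEAR_USD = Decimal("10178.99")


def new_servers_needed(old_servers: int) -> int:
    """Fewest new servers that cover the work of old_servers older servers."""
    return math.ceil(old_servers / WORK_RATIO)


def spare_capacity(old_servers: int) -> Fraction:
    """Leftover capacity, measured in units of one older server's work."""
    return new_servers_needed(old_servers) * WORK_RATIO - old_servers


def yearly_license_savings(old_servers: int) -> Decimal:
    """Licensing saved per year from the servers removed by consolidation."""
    removed = old_servers - new_servers_needed(old_servers)
    return removed * LICENSE_1_YEAR_USD


if __name__ == "__main__":
    for old in (3, 30):
        print(old, new_servers_needed(old), float(spare_capacity(old)),
              yearly_license_savings(old))
```

Exact fractions and decimals are used instead of floats so that whole-number cases (for example, 17 older servers mapping to exactly 10 newer ones) and currency totals are not thrown off by rounding error.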
About Broadcom Gigabit Ethernet BCM5720 Controller
The Dell PowerEdge servers we tested feature Broadcom Gigabit Ethernet BCM5720 controllers. According to Broadcom, its 1G Ethernet Controllers are “the ideal solution for multicore servers, delivering full line-rate throughput across all ports.”[xv]
The BCM5720 Dual-Port 1GBASE-T PCIe 2.1 Ethernet Controller is a 13th generation 10/100/1000BASE-T and 10/100/1000BASE-X Ethernet LAN controller solution. The host interface supports a separate PCIe function for each LAN interface and the controller includes I/O Virtualization (IOV) features such as 17 receive and 16 transmit queues, and 17 MSI-X vectors with flexible vector-to-queue association. These IOV features enable the BCM5720 to support the VMware® NetQueue and Microsoft VMQ technologies.[xvi]
Broadcom also states that this controller has “a comprehensive set of hardware features that the system may use to implement IEEE 1588 or IEEE 802.1AS-based time synchronization. These hardware features include a high-precision clock, timestamp registers for receive/transmit packets, and programmable trigger inputs and watchdog outputs.”[xvii]
Learn more at https://www.broadcom.com/products/ethernet-connectivity/network-adapters/bcm5720-1gbase-t-ic.
About Broadcom PERC 11 PERC H755N controllers
The PERC11 series of adapters provides dependable, high-performance, fault-tolerant management of the disk subsystem. These adapters offer extensive RAID control capabilities, with support for multiple RAID levels, including 0, 1, 5, 6, 10, 50, and 60.[xviii] This facilitates efficient data safeguarding and redundancy mechanisms within the system.
Regarding compatibility, the PERC11 adapters conform to the Serial Attached SCSI (SAS) 3.0 standard, which supports a maximum data throughput of 12 Gb/s. The adapters are also compatible with a wide array of storage devices: they work with Dell-qualified SAS and SATA hard drives, solid-state drives (SSDs), and PCIe SSDs (NVMe). This versatility lets users choose storage options that match their specific requirements.
Conclusion
As data proliferates and the sizes of databases grow, the potential to unlock valuable insights from them becomes increasingly dependent on fast architectures that can handle compute-intensive machine learning workloads such as k-means clustering and Bayesian inference. By upgrading to the latest servers, organizations can scale their processing power to meet the growing demands of their databases.
Larger databases and more powerful algorithms have the potential to give organizations a competitive edge. Faster servers can improve the accuracy of data-driven decisions by allowing organizations to use more complex algorithms and update ML models more frequently. To consider just two examples, improved performance could allow an e-commerce company to make better recommendations to customers and a financial services company to assess risks more accurately.
When we compared the machine learning performance of a 16G Dell PowerEdge R7625 server powered by 4th Gen AMD EPYC 64-core processors with Broadcom NICs and PERC 11 storage controllers to a previous-generation PowerEdge server, we found performance enhancements in terms of throughput and speed, whether running k-means clustering or Bayesian workloads. These findings suggest that organizations that rely on machine learning algorithms might gain performance advantages by upgrading to the latest generation of these Dell servers.
This project was commissioned by Dell Technologies.
September 2023
Principled Technologies is a registered trademark of Principled Technologies, Inc.
All other product names are the trademarks of their respective owners.
[i] Petroc Taylor, “Volume of data/information created, captured, copied, and consumed worldwide from 2010 to 2020, with forecasts from 2021 to 2025,” accessed June 12, 2023, https://www.statista.com/statistics/871513/worldwide-data-created/.
[ii] Andreja Velimirovic, “Why Density per Rack is Going Up,” accessed June 12, 2023, https://phoenixnap.com/blog/rack-density-increasing.
[iii] The Science of Machine Learning, “Exponential Growth,” accessed June 12, 2023, https://www.ml-science.com/exponential-growth.
[iv] Databricks, “Apache Spark.”
[v] Dell, “PowerEdge R7625 Rack Server,” accessed June 11, 2023, https://www.dell.com/en-us/shop/dellpoweredge-servers/poweredge-r7625-rack-server/spd/poweredge-r7625/pe_r7625_15972_vi_vp.
[vi] Dell, “PowerEdge R7625 Rack Server.”
[vii] AMD, “AMD EPYC Processors,” accessed June 27, 2023, https://www.amd.com/en/processors/epyc-server-cpu-family.
[viii] AMD, “AMD EPYC Processors.”
[ix] AMD, “AMD EPYC Processors.”
[x] GitHub, “HiBench Suite,” accessed June 27, 2023, https://github.com/Intel-bigdata/HiBench.
[xi] Christopher J. Yarnell, John T. Granton, and George Tomlinson, “Bayesian Analysis in Critical Care Medicine,” accessed June 27, 2023, https://www.atsjournals.org/doi/10.1164/rccm.201910-2019ED.
[xii] Sandeep K. Gupta, “Use of Bayesian statistics in drug development: Advantages and challenges,” accessed June 16, 2023, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3657986/.
[xiii] Insight, “Red Hat OpenShift Platform Plus - standard subscription (1 year) - 1-2 sockets,” accessed July 16, 2023, https://www.insight.com/en_US/shop/product/MW01624/red%20hat%20software/MW01624/Red-[…]nShift-Platform-Plus-standard-subscription-1-year-12-sockets/.
[xiv] Insight, “Red Hat OpenShift Platform Plus - standard subscription (3 years) - 1-2 sockets,” accessed July 26, 2023, https://www.insight.com/en_US/shop/product/MW01624F3/red%20hat%20software/MW01624F3/[…]Shift-Platform-Plus-standard-subscription-3-years-12-sockets/.
[xv] Broadcom, “BCM5720 - Dual-Port 1GBASE-T,” accessed June 8, 2023, https://www.broadcom.com/products/ethernet-connectivity/network-adapters/bcm5720-1gbase-t-ic.
[xvi] Broadcom, “BCM5720 - Dual-Port 1GBASE-T.”
[xvii] Broadcom, “BCM5720 - Dual-Port 1GBASE-T.”
[xviii] Dell, “Dell PowerEdge RAID Controller 11 User’s Guide PERC H755, H750, H355, and H350 Controller Series—Dell Technologies PowerEdge RAID Controller 11,” accessed June 28, 2023, https://www.dell.com/support/manuals/en-us/poweredge-r6525/perc11_ug/dell-technologies-poweredge-raid-controller-11?.
Related Documents
![Post thumbnail](https://cdn-prod.scdn6.secure.raxcdn.com/static/media/0d718714-0d5f-4754-b622-9f85f5c35351.jpeg?_cb=1711727206.0276408)
Dell PowerEdge R7625 Rack Server & Emulex LPe36002 Host Bus Adapter: 64G Fibre Channel Microsoft SQL Server
Fri, 29 Mar 2024 16:19:02 -0000
Dell PowerEdge R7625 Rack Server & Emulex LPe36002 Host Bus Adapter
64G Fibre Channel Microsoft SQL Server Performance – NVMe/FC vs. SCSI/FC
Tolly Report #224107
Tolly test report demonstrating that Dell PowerEdge R7625 Rack Server outfitted with the Emulex LPe36002 Host Bus Adapter using NVMe/FC can improve application performance vs older generation SCSI/FC.
Executive Summary
New generation servers can bring higher performance across a range of areas. This is certainly the case with Dell’s 16th-generation server line. Similarly, newer protocols like NVM Express (NVMe) over Fibre Channel (FC) can provide greater throughput and efficiency than older SCSI over FC. Dell is unique in offering an end-to-end NVMe/FC connectivity solution in the mid-range storage marketplace with the PowerStore line.
Dell commissioned Tolly to benchmark the performance of the Broadcom Emulex LPe36002 64G Fibre Channel dual-port host bus adapter (HBA) running in the Dell PowerEdge R7625 Rack Server with AMD EPYC processors by testing using actual database applications rather than simulated I/O microbenchmarks. Testing focused on evaluating the database throughput, latency, and CPU efficiency of accessing Microsoft SQL Server 2019 for Linux systems over older SCSI/FC and newer NVMe/FC. Databases were stored on a Dell PowerStore 9200T storage appliance.
Tests showed significant improvements in transaction throughput, latency reduction, and CPU efficiency. See Figure 1 for a summary of relative improvements.
The Bottom Line | |
Dell PowerEdge R7625 with AMD EPYC processors & Emulex LPe36002 64G HBA using NVMe/FC: | |
1 | Improved database transactions by 38% |
2 | Reduced database stored procedure latency by 35% |
Overview
The goal of this test was to illustrate the performance benefits of using the newer, more-efficient NVMe/FC protocol in lieu of the older, less-efficient SCSI/FC protocol in conjunction with Emulex 64G FC HBAs running under Linux in a Dell PowerEdge R7625 Rack Server. (Dell sells the Emulex 64G FC HBA for the same price as the Emulex 32G FC HBA.)
The test was run using Microsoft SQL Server 2019 for Linux accessing the database via SCSI and then via NVMe.
While low-level component benchmarks are instructive, system architects are ultimately most interested in how network-level improvements translate into application performance improvements. This benchmarking was done with HammerDB, which generates actual user transactions against an actual database. The test focused on TPROC-C, the HammerDB implementation of the de facto standard TPC-C online transaction processing benchmark.
Tests showed significant improvements in key benchmarks.
Test Results
Microsoft SQL Server 2019 for Linux
Transaction Processing. The NVMe/FC results were significantly better than the SCSI/FC results. When run over NVMe/FC, 38% more transactions per minute were processed.
CPU Efficiency. The NVMe/FC results were significantly better than the SCSI/FC results. When run over NVMe/FC, the CPU efficiency was improved by 50%.
P95 Stored Procedure Latency. Similarly, the NVMe/FC results were significantly better than the SCSI/FC results. When run over NVMe/FC, the latency was reduced by 35%.
Test Setup & Methodology
The HBA under test used current production drivers that are publicly available. Default settings were used. Details of the test environment and systems under test are found in Tables 1-5. Figure 2 shows a composite test environment.
Database Test
The goal of this test was to benchmark the database transaction performance of each HBA running the HammerDB “TPROC-C” workload which, as noted earlier, is the HammerDB database version of the Transaction Processing Performance Council’s TPC-C OLTP benchmark.
A Dell PowerEdge R7625 server, powered by AMD EPYC processors, was configured with the HBA under test. The Broadcom Emulex LPe36002 64G HBA connected to a Dell PowerStore 9200T via a Dell Connectrix 64G Fibre Channel switch. The test utilized a single 64G FC port of the Emulex HBA.
The server ran RHEL 8.9. SCSI Device Mapper and NVMe native multipath were enabled for the respective devices. NUMA was set to off and “transparent huge pages” was disabled.
For storage, the path selection policy for NVMe native multipath was set to “round-robin”. For SCSI, the Device Mapper multipath path selector was set to “queue-length 0”.
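On a Linux host, two of the settings described above can be read back from sysfs. The sketch below is a hedged illustration and not part of the Tolly methodology: the helper names are hypothetical, and the NVMe subsystem name `nvme-subsys0` is an assumption that will differ between systems.

```python
from pathlib import Path
from typing import Optional

THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")
# The subsystem name is an assumption; list /sys/class/nvme-subsystem/ to find yours.
NVME_IOPOLICY_PATH = Path("/sys/class/nvme-subsystem/nvme-subsys0/iopolicy")


def selected_option(text: str) -> str:
    """Return the bracketed choice from text such as 'always madvise [never]'.

    Files that hold a single bare value, such as iopolicy, are returned stripped.
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return text.strip()


def read_setting(path: Path) -> Optional[str]:
    """Read one sysfs setting, or return None if the file is absent."""
    if not path.exists():
        return None
    return selected_option(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    print("transparent huge pages:", read_setting(THP_PATH))
    print("NVMe multipath I/O policy:", read_setting(NVME_IOPOLICY_PATH))
```

On a host configured as described, the expected values would be `never` for transparent huge pages and `round-robin` for the NVMe I/O policy.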
This test was run using Microsoft SQL Server 2019 for Linux.
The open source HammerDB test tool was used to populate the database schema and run the workload.
Table 1. HBA Under Test
Vendor | Product Name | Firmware | Driver |
Broadcom | Emulex LPe36002 (64G) (PCIe 4.0) | 14.0.539.26 | 14.0.0.15 |
Table 2. Server Configuration
Vendor/System | Dell PowerEdge R7625 |
CPU | 2 socket AMD EPYC 9374F 32-Core Processor @ 3.8 GHz |
Number of CPUs | 128 logical processors. Profile: Performance, Logical Processors: Enabled, Sub Numa Clustering: Disabled |
Memory (RAM) | 256 GB |
Power Mode | Performance |
OS | Red Hat Ent. Linux 8.9 (RHEL8) |
Kernel | 4.18.0-425.3.1 |
Table 3. Microsoft Database Configuration
Database | Microsoft SQL Server 2019 for Linux |
Storage | Single volume, XFS |
Dataset Size | 100 GB |
DB Memory Allocation | 10G |
Table 4. Database Test Tool
Vendor | Open Source |
Application | HammerDB 4.7 |
TPROC-C settings | Total # of Warehouses = 1,000; Transactions per user = 1 million; Ramp-up time: 2 minutes; Run time: 5 minutes |
Table 5. Storage Configuration
Vendor/Device | Dell PowerStore 9200T v3.5 |
Ports | 8 x 32G FC |
Volume Size | 1,024GB volume each for NVMe/FC and SCSI/FC |
Namespace/LUN | 8 x 32G target ports (single namespace) |
Network Fabric | Dell Connectrix 64G FC switch v9.0.1a |
About AMD
For over 50 years, AMD has been at the forefront of driving innovation in high-performance computing, graphics, and visualization technologies. Their products are relied upon by billions of people, leading Fortune 500 businesses, and cutting-edge scientific research institutions worldwide. AMD's mission is to build exceptional products that accelerate next-generation computing experiences and power solutions for the world's most important challenges. Visit http://www.amd.com for more information about AMD.
Broadcom Emulex LPe36002
The Broadcom Emulex LPe36000-series Gen 7 Fibre Channel HBAs are designed for demanding mission-critical workloads and emerging applications. The family of adapters features Silicon Root of Trust security, designed to thwart firmware attacks aimed at enterprises and governments.
Gen 7 64G provides seamless backward compatibility to 32G and 16G networks.
Dell sells the LPe36002 64G HBA for the same price as the 32G model.
About Tolly
The Tolly Group companies have been delivering world-class IT services for over 30 years. Tolly is a leading global provider of third-party validation services for vendors of IT products, components and services.
You can reach the company by E-mail at sales@tolly.com, or by telephone at +1 561.391.5610.
Visit Tolly on the Internet at: http://www.tolly.com
Tolly Terms Of Usage
The Tolly Gro This document is provided, free-of-charge, to help you understand whether a given product, technology, or service merits additional investigation for your particular needs. Any decision to purchase a product must be based on your own assessment of suitability based on your needs. The document should never be used as a substitute for advice from a qualified IT or business professional. This evaluation was focused on illustrating specific features and/or performance of the product(s) and was conducted under controlled, laboratory conditions. Certain tests January have been tailored to reflect performance under ideal conditions; performance January vary under real-world conditions. Users should run tests based on their own real-world scenarios to validate performance for their own networks.
Reasonable efforts were made to ensure the accuracy of the data contained herein but errors and/or oversights can occur. The test/audit documented herein January also rely on various test tools the accuracy of which is beyond our control. Furthermore, the document relies on certain representations by the sponsor that are beyond our control to verify. Among these is that the software/hardware tested is production or production track and is, or will be, available in equivalent or better form to commercial customers. Accordingly, this document is provided "as is," and Tolly Enterprises, LLC (Tolly) gives no warranty, representation or undertaking, whether express or implied, and accepts no legal responsibility, whether direct or indirect, for the accuracy, completeness, usefulness, or suitability of any information contained herein. By reviewing this document, you agree that your use of any information contained herein is at your own risk, and you accept all risks and responsibility for losses, damages, costs, and other consequences resulting directly or indirectly from any information or material available on it. Tolly is not responsible for, and you agree to hold Tolly and its related affiliates harmless from any loss, harm, injury, or damage resulting from or arising out of your use of or reliance on any of the information provided herein.
Tolly makes no claim as to whether any product or company described herein is suitable for investment. You should obtain your own independent professional advice, whether legal, accounting or otherwise, before proceeding with any investment or project related to any information, products or companies described herein. When foreign translations exist, the English document is considered authoritative. To assure accuracy, only use documents downloaded directly from Tolly.com. No part of any document may be reproduced, in whole or in part, without the specific written permission of Tolly. All trademarks used in the document are owned by their respective owners. You agree not to use any trademark in or as the whole or part of your own trademarks in connection with any activities, products or services which are not ours, or in a manner which may be confusing, misleading, or deceptive or in a manner that disparages us or our information, projects or developments.
Gen 7 Emulex® HBAs by Broadcom® Application Advantage for Dell R7625 AMD EPYC Servers
Tue, 02 Apr 2024 23:05:59 -0000
Dell PowerEdge R7625 servers with AMD EPYC processors & Emulex 64G Fibre Channel LPe36002 Host Bus Adapters demonstrate Application Advantages
Executive Summary
New generation technology can be expected to improve performance. There are times, however, when multiple technology advances can combine to provide an outsized advantage. Such is the case when the Dell PowerEdge R7625 Rack Server is combined with the Broadcom Emulex LPe36002 64G Fibre Channel Host Bus Adapter.
Dell commissioned Tolly to benchmark the database performance of the Broadcom Emulex LPe36002 64G Fibre Channel dual-port host bus adapter (HBA) running in the Dell PowerEdge R7625 Rack Server and to compare it with the same combined workload running on four separate R740-class servers, each outfitted with a 16G FC HBA, as was standard for that server generation.
Following is a summary of the 4 tests conducted:
1. The first test measured HammerDB “TPROC-C” Online Transaction Processing (OLTP) workload performance with Microsoft SQL Server Database to compare NVMe/FC and SCSI/FC performance on a Dell PowerEdge R7625 server with a Broadcom Emulex LPe36002 64G Fibre Channel HBA.
Key Findings:
- Improved database transactions by up to 38%
- Reduced database stored procedure latency by up to 35%
- Improved server CPU efficiency by up to 50%
2. The second test measured the HammerDB “TPROC-H” Decision Support System (DSS) analytics workload queries on a single Dell R7625 AMD EPYC-based platform and found that it pushed the Emulex 64G Fibre Channel HBA to the full 64G Fibre Channel line rate, thus matching the combined application throughput of four previous-generation R740-class Purley platform servers using 16G Fibre Channel HBAs.
Key Findings:
- Impressive database analytics throughput consolidation: from four R740 servers with 16G Fibre Channel HBAs to a single R7625 with a 64G Fibre Channel HBA
- Consolidating analytics workloads can significantly reduce I/O-bound query times
3. The third test revealed a 4:1 server consolidation benefit for virtualization workloads, where a single Dell R7625 AMD EPYC-based platform with a 64G Fibre Channel HBA matched the combined application throughput of four Dell R740-class Purley platform servers using 16G Fibre Channel HBAs.
Key Findings:
- Consolidation of virtual machine (VM) “Boot Storm” virtualization workload throughput from four Dell R740 servers with 16GFC HBAs to a single Dell R7625 with an Emulex 64G Fibre Channel HBA
- A VDI boot storm is the consumption of compute and disk I/O resources during the initial startup of end-user desktop virtual images that results in poor performance for all users. VDI environments need read I/O at boot (Bootstorm).
4. The final test determined that the Dell R7625 with PCIe Gen5 and the Emulex 64G Fibre Channel HBA combined to overcome bottlenecks for Oracle database HammerDB “TPROC-H” DSS analytics workload queries, achieving maximum throughput.
Key Findings:
- R7625 with 64GFC HBA can achieve 4x the database analytics throughput of the 16GFC HBA and 2x the throughput of the 32GFC HBA
- 42% improvement in complex database ad hoc query processing time when running the dual-port 64GFC HBA on the PCIe 5.0-based R7625 server compared to the older generation R740 server
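The 4:1 consolidation and the 4x and 2x throughput figures above line up with the nominal Fibre Channel generation rates. The following is a rough sanity check in Python, not part of the Tolly methodology. It assumes the generation labels (16G, 32G, 64G) can be treated as relative per-port line rates; the names `NOMINAL_GBPS`, `nominal_ratio`, and `aggregate_nominal` are hypothetical and used only for illustration. Measured application throughput depends on the workload, the storage array, and the server platform.

```python
# Nominal Fibre Channel generation labels, treated here as relative
# per-port line rates. Assumption for illustration only: these are
# generation labels, not measured application throughput.
NOMINAL_GBPS = {"16GFC": 16, "32GFC": 32, "64GFC": 64}


def nominal_ratio(new_gen: str, old_gen: str) -> float:
    """Return how many times faster new_gen is than old_gen, nominally."""
    return NOMINAL_GBPS[new_gen] / NOMINAL_GBPS[old_gen]


def aggregate_nominal(gen: str, server_count: int) -> int:
    """Return the combined nominal rate of server_count single-port links."""
    return NOMINAL_GBPS[gen] * server_count


if __name__ == "__main__":
    # Four older servers with one 16GFC link each vs. one 64GFC link.
    print("4 x 16GFC aggregate:", aggregate_nominal("16GFC", 4))
    print("1 x 64GFC aggregate:", aggregate_nominal("64GFC", 1))
    print("64GFC vs 16GFC ratio:", nominal_ratio("64GFC", "16GFC"))
    print("64GFC vs 32GFC ratio:", nominal_ratio("64GFC", "32GFC"))
```

On these nominal numbers, four 16GFC links and one 64GFC link come out equal. The tests summarized above report that a single R7625 matched the four R740-class servers in measured throughput as well.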