DDR5 Memory Bandwidth for Next-Generation PowerEdge Servers Featuring 4th Gen AMD EPYC Processors
Wed, 03 May 2023 15:49:23 -0000
Summary
Dell Technologies has announced some exciting new servers featuring the latest 4th Gen AMD EPYC processors. These servers come in 1- and 2-socket versions in 1U and 2U form factors. Each socket supports up to 12 DIMMs at speeds of up to 4,800 MT/s. This document compares the memory bandwidth readings observed with these new servers against previous-generation servers running 3rd Gen AMD EPYC processors.
4th Gen AMD EPYC memory architecture
The 4th Gen AMD EPYC processors are the first AMD x86 server processors to support DDR5 memory. The CPUs themselves still have a chiplet design with a central I/O chiplet surrounded by compute chiplets. The memory runs at speeds of up to 4,800 MT/s, which is 50 percent faster than the 3,200 MT/s that the previous 3rd Gen AMD EPYC processors supported.
One other significant difference is the number of DIMM slots per socket. The 3rd Gen AMD EPYC processors supported up to 16 DIMMs per socket in a 2 DIMMs per channel configuration or 8 DIMMs per socket in a 1 DIMM per channel configuration, and the 2 DIMMs per channel configuration was limited to a maximum speed of 2,933 MT/s. The new servers with 4th Gen AMD EPYC processors support up to 12 DIMMs per socket, one on each of 12 memory channels.
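The generational difference can be estimated on paper before looking at measurements. The following is a minimal sketch, assuming each memory channel moves 8 bytes per transfer (a 64-bit data path, ECC bits excluded) and using decimal GB/s. The channel counts and speeds come from this document; the function name and constants are illustrative and are not part of Dell's test methodology.

```python
BYTES_PER_TRANSFER = 8  # assumption: 64-bit data path per channel, ECC excluded


def peak_bandwidth_gbs(channels: int, speed_mts: int) -> float:
    """Theoretical peak bandwidth of one socket in GB/s (decimal)."""
    return channels * speed_mts * BYTES_PER_TRANSFER / 1000


# 1 DIMM per channel on both generations, at the maximum supported speed
gen3_gbs = peak_bandwidth_gbs(channels=8, speed_mts=3200)
gen4_gbs = peak_bandwidth_gbs(channels=12, speed_mts=4800)
uplift_pct = (gen4_gbs / gen3_gbs - 1) * 100

print(f"3rd Gen peak per socket: {gen3_gbs:.1f} GB/s")
print(f"4th Gen peak per socket: {gen4_gbs:.1f} GB/s")
print(f"Theoretical uplift:      {uplift_pct:.0f}%")
```

These are upper bounds. Sustained bandwidth measured by a benchmark is always lower, so the paper figures are best used as a sanity check on measured results.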
Memory bandwidth test
To quantify the impact of this increase in memory support, we performed two studies.1 The first study (see Figure 1) measured memory bandwidth as a function of the number of DIMMs populated per CPU. To measure the memory bandwidth, we used the STREAM Triad benchmark. STREAM is a synthetic benchmark that is designed to measure sustainable memory bandwidth (in MB/s) and a corresponding computation rate for four simple vector kernels. Of the four kernels, Triad is the most complex. We ran the benchmark on the following systems:
- Previous-generation Dell PowerEdge R7525 powered by AMD’s 3rd Gen EPYC CPUs populated with up to 16 DDR4 3,200 MT/s DIMMs per socket
- Latest-generation Dell PowerEdge R7625 powered by AMD’s 4th Gen EPYC CPUs populated with up to 12 DDR5 4,800 MT/s DIMMs per socket
We used default BIOS configurations for this test.
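For readers unfamiliar with STREAM, the Triad kernel computes a[i] = b[i] + q * c[i] over large arrays and counts two reads and one write per element. The following pure-Python sketch shows only the kernel and the byte accounting. It is not the benchmark used in our tests: the real STREAM is compiled C or Fortran, runs multithreaded, and uses arrays far larger than the CPU caches, so the rate printed here is limited by the interpreter and says nothing about hardware bandwidth. The function name and default sizes are illustrative.

```python
import time
from array import array


def triad_bandwidth_mbs(n: int = 200_000, scalar: float = 3.0, repeats: int = 5) -> float:
    """Run a[i] = b[i] + scalar * c[i] and return the best observed rate in MB/s."""
    a = array("d", [0.0]) * n
    b = array("d", [2.0]) * n
    c = array("d", [1.0]) * n
    # Triad touches three arrays per iteration: read b, read c, write a.
    bytes_moved = 3 * a.itemsize * n
    best_seconds = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for i in range(n):
            a[i] = b[i] + scalar * c[i]
        best_seconds = min(best_seconds, time.perf_counter() - start)
    # Verify the result so the loop cannot silently compute the wrong thing.
    assert a[0] == 5.0 and a[-1] == 5.0
    return bytes_moved / best_seconds / 1e6


if __name__ == "__main__":
    print(f"Triad (interpreter-bound): {triad_bandwidth_mbs():.1f} MB/s")
```

Reporting the best of several repetitions follows the STREAM convention of using the minimum time per kernel.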
The following figures show the system aggregate memory bandwidth across two CPUs:
Figure 1. System aggregate memory bandwidth trends with DIMM population for 4th Gen AMD EPYC processor-based PowerEdge servers with default BIOS settings
Figure 2. System aggregate memory bandwidth trends with DIMM population for 3rd Gen AMD EPYC processor-based PowerEdge servers with default BIOS settings
Consider that a fully balanced configuration requires all DIMM channels to be populated: 8 DIMMs per socket for the 3rd Gen and 12 DIMMs per socket for the 4th Gen. Given these differences, it is challenging to do a direct comparison. However, if we compare the numbers for a balanced configuration with 1 DIMM per channel, we see a 112 percent increase in bandwidth. With just 8 channels populated in both cases, we see a 45 percent increase in bandwidth. Even though this is not a balanced configuration for the 4th Gen system, we still see a significant performance increase at this point.
Figure 3. System aggregate memory bandwidth trends with DIMM population for 4th Gen AMD EPYC processor-based PowerEdge servers with tuned BIOS settings
We collected a second series of data points on the R7625 with BIOS settings tuned for best memory performance. This included setting NUMA nodes per socket (NPS) to NPS4 and disabling the CCX as NUMA option. With these settings, the maximum bandwidth of the R7625 increases by another 14.5 percent to a class-leading 789 GB/s.
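To put the tuned result in context, it can be compared with the theoretical peak of the platform. The following sketch assumes 8 bytes per transfer per channel, decimal GB/s, and that the 14.5 percent gain is measured against the default-BIOS maximum. The implied default value is back-calculated from the numbers in the text and is not a published measurement.

```python
BYTES_PER_TRANSFER = 8    # assumption: 64-bit data path per channel
SOCKETS = 2               # the figures report the aggregate across two CPUs
CHANNELS_PER_SOCKET = 12  # fully balanced 4th Gen configuration
SPEED_MTS = 4800

measured_tuned_gbs = 789.0  # tuned-BIOS result quoted in the text
tuning_gain = 0.145         # 14.5 percent gain quoted in the text

peak_gbs = SOCKETS * CHANNELS_PER_SOCKET * SPEED_MTS * BYTES_PER_TRANSFER / 1000
efficiency_pct = measured_tuned_gbs / peak_gbs * 100
implied_default_gbs = measured_tuned_gbs / (1 + tuning_gain)

print(f"Theoretical peak: {peak_gbs:.1f} GB/s")
print(f"Tuned result:     {efficiency_pct:.1f}% of peak")
print(f"Implied default:  {implied_default_gbs:.0f} GB/s (back-calculated)")
```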
Conclusion
With up to 96 cores per socket and significant increases in memory bandwidth, Dell PowerEdge servers with 4th Gen AMD EPYC processors continue to provide best-in-class features and specifications to satisfy the most demanding workloads.
1 Tests were performed in January 2023 at the Solutions and Performance Analysis Lab at Dell Technologies.
Related Documents
Save Time, Rack Space, and Money—5:1 Server Consolidation Made Possible with the Latest AMD EPYC Processors
Thu, 20 Apr 2023 17:41:37 -0000
Summary
The latest Dell PowerEdge servers with AMD EPYC 4th Generation processors, each with up to 96 cores, deliver exceptional value to our customers. The large number of cores coupled with the high-speed DDR5 memory and very high-speed PCIe Gen5 devices makes for servers that can run almost any workload with ease. These servers are especially well suited for virtualization workloads. These unprecedented performance enhancements enabled Dell Technologies to achieve multiple virtualization world records. The cluster-level benchmarks for virtualized workloads are an excellent example of the performance and power-performance world record gains that are achievable.
Running a mixture of architectures in your data center can be cause for some concern—especially if you are looking to upgrade to the latest AMD servers and you are currently running the workloads on legacy Intel® based servers. Even with the greatest level of planning, there is always the fear that some unexpected variable might turn everything upside down during the migration process. Now, there is a new tool for your toolbox to make such migrations easier. The VMware Architecture Migration Tool1 is a PowerShell script that uses VMware PowerCLI to eliminate the guesswork and complexity involved in migrating a virtual machine from one hardware architecture to another.
To fully test the tool, Dell ran a full migration scenario. We were able to consolidate 380 VMs running on five legacy Intel platform servers into one Dell PowerEdge R7625 with AMD EPYC 4th Gen processors. We describe our testing in more detail later in this paper.
Why migrate?
In today’s IT departments, workloads are always evolving. There is increasing pressure to support new workloads while keeping existing workloads to support existing business needs—all while also trying to reduce costs and meet corporate goals.
The latest technology tends to bring multiple advantages, driving the need to upgrade. Some of these advantages are:
- Higher performance
The latest Dell PowerEdge servers with 4th Gen AMD EPYC processors have class-leading performance with up to 121 percent higher scores than prior generations.2
- Better efficiency
The Dell PowerEdge servers with 4th Gen AMD EPYC processors are some of the first to achieve the EPEAT silver rating, indicating the highest level of environmental responsibility and efficiency. Dell has achieved 159 percent higher performance per kilowatt on the VMmark benchmark with the R7625 compared to the prior-generation model server.3
- More security
With Dell’s Cyber Resilient Architecture and AMD’s Infinity Guard, the PowerEdge servers with 4th Gen AMD EPYC processors offer top-class security to ensure that your data and infrastructure are protected.4
- Workload optimizations
The 4th Generation AMD EPYC processors have several optimizations, such as support for AVX-512, INT8, and BFLOAT16. The processors can deliver exceptional performance for workloads that can take advantage of such optimizations.
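The performance and efficiency percentages above can be reproduced from the scores given in the footnotes. The following sketch uses only those published scores; the helper name is illustrative.

```python
def pct_higher(new: float, old: float) -> float:
    """Return how much higher `new` is than `old`, in percent."""
    return (new / old - 1) * 100


# Footnote 2: SPECFPRate scores, R7625 (EPYC 9654) vs. R7525 (EPYC 7763)
specfp_gain = pct_higher(1410, 636)

# Footnote 3: VMmark Server Power-Performance scores, R7615 vs. R7515 clusters
vmmark_gain = pct_higher(21.0179, 8.1263)

print(f"SPECFPRate gain:               {specfp_gain:.1f}%")
print(f"VMmark power-performance gain: {vmmark_gain:.1f}%")
```

The first value comes out slightly above the 121 percent quoted in the text, which appears to truncate rather than round. The second rounds to the quoted 159 percent.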
VMware Architecture Migration Tool
The VMware Architecture Migration Tool (VAMT) was developed jointly by AMD and VMware to automate the migration of legacy VMs from Intel architecture to AMD architecture, with the goal of delivering a better user experience and better business value. Freely available on GitHub, VAMT offers several key features:
- Architecture agnostic and open source
- Fully automated cold migration
- VM success validation
- Process throttling
- Change window support
- Email and syslog support
- Audit trail
- Rollback
The tool streamlines and simplifies the migration process in a trustworthy fashion.
Benchmarking
Dell used VAMT and the VMmark benchmark to achieve some remarkable consolidation on the PowerEdge R7625.
The VMmark benchmark allowed us to set up a workload in the form of tiles within each hardware cluster. Each tile consisted of 19 different VMs running a workload internally. The benchmark was deployed across five legacy Intel based servers and eventually migrated to a single AMD based PowerEdge server. A Dell PowerMax 2000 SAN was used for data storage. The following table shows the configuration details:
Table 1. Configuration of source and target servers
Component/specification | Source | Target |
---|---|---|
Number of servers | 5 | 1 |
Processor | Intel 8180 | AMD EPYC 9654 |
Cores per server | 56 | 192 |
Memory | 768 GB | 3 TB |
Tiles | 4 | 20 |
VMs per server | 76 | 380 |
Server | Server vendor A | Dell PowerEdge R7625 |
Storage | PowerMax 2000; 30 TB spread across 6 LUNs | |
Network | 32 Gb FC network for storage, 25 GbE for data network on VMs through a 4-way splitter, 100 Gb switch | |
We were able to run four tiles on each of the five legacy servers, for a total of 20 tiles and 380 VMs. VAMT was then used to migrate the VMs to the target PowerEdge server.
The tool completed a cold migration of all 380 VMs to the target server in 57 minutes!
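The consolidation figures follow directly from the tile layout. The following sketch recomputes them from the values in Table 1 and the migration time above; the variable names are illustrative. The per-VM figure is an average across the whole run, not the duration of any single VM migration, because the tool may migrate several VMs in parallel.

```python
LEGACY_SERVERS = 5
TILES_PER_LEGACY_SERVER = 4
VMS_PER_TILE = 19
MIGRATION_MINUTES = 57

total_tiles = LEGACY_SERVERS * TILES_PER_LEGACY_SERVER
total_vms = total_tiles * VMS_PER_TILE
vms_per_legacy_server = TILES_PER_LEGACY_SERVER * VMS_PER_TILE

# Average throughput over the whole migration window
vms_per_minute = total_vms / MIGRATION_MINUTES
seconds_per_vm = MIGRATION_MINUTES * 60 / total_vms

print(f"Tiles on target:        {total_tiles}")
print(f"VMs per legacy server:  {vms_per_legacy_server}")
print(f"VMs on target:          {total_vms}")
print(f"Average migration rate: {vms_per_minute:.1f} VMs/min ({seconds_per_vm:.0f} s per VM)")
```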
Achieving value
The Dell PowerEdge R7625 with 4th Gen AMD EPYC processors delivers significant technology advancements that can deliver value in any virtualized deployment. Consolidating from five servers to a single server is an example of the extent of savings possible. This kind of consolidation allows for significant license cost savings and fewer hours on system management. Decommissioning the five legacy systems also reduces power draw and operational costs by as much as 64 percent,5 even while also running workloads on the latest architecture with security features like Secure Memory Encryption (SME) and Secure Encrypted Virtualization (SEV). AMD SEV helps safeguard privacy and integrity by encrypting each virtual machine.
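The power figure can be checked against the CPU TDP totals given in footnote 5. The following sketch uses only those totals and the server counts from Table 1. Note that TDP is a design limit for the processors alone, so this is an estimate of CPU power budget and not a measurement of whole-server power draw.

```python
LEGACY_TOTAL_TDP_W = 2050  # five dual-socket legacy servers (footnote 5)
TARGET_TOTAL_TDP_W = 720   # one dual-socket PowerEdge server (footnote 5)
LEGACY_CPUS = 5 * 2
TARGET_CPUS = 1 * 2

reduction_pct = (1 - TARGET_TOTAL_TDP_W / LEGACY_TOTAL_TDP_W) * 100
legacy_tdp_per_cpu_w = LEGACY_TOTAL_TDP_W / LEGACY_CPUS
target_tdp_per_cpu_w = TARGET_TOTAL_TDP_W / TARGET_CPUS

print(f"Legacy TDP per CPU: {legacy_tdp_per_cpu_w:.0f} W")
print(f"Target TDP per CPU: {target_tdp_per_cpu_w:.0f} W")
print(f"Total CPU TDP reduction: {reduction_pct:.1f}%")
```

Each new processor has a higher TDP than each legacy processor. The saving comes entirely from needing two CPUs instead of ten.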
1 https://github.com/vmware-samples/vmware-architecture-migration-tool
2 Based on Dell analysis of submitted SPECFPRate score of 1410 achieved on a Dell PowerEdge R7625 with AMD EPYC 9654s compared to the previous high score of 636 on a Dell PowerEdge R7525 with AMD EPYC 7763 processors as of 11/3/2022. Actual performance might vary.
3 Based on Dell analysis of published VMmark Server Power-Performance score of 21.0179@21 tiles achieved on a Dell PowerEdge R7615 cluster with AMD EPYC 9654P processors compared to the score of 8.1263@12 tiles achieved on a Dell PowerEdge R7515 cluster with the AMD EPYC 7763 processors as of 4/13/2023. Actual performance might vary.
5 Based on Dell internal analysis comparing the total CPU TDP of 2,050 W from five dual-socket servers with the Intel Xeon 8180 processors compared to the total CPU TDP of 720 W from a single dual-socket Dell PowerEdge server with AMD EPYC 9654 processors as of 4/13/2023. Actual performance might vary.
VDI on Dell PowerEdge Infrastructure with 4th Generation AMD EPYC Processors
Fri, 14 Apr 2023 15:17:26 -0000
Summary
Dell PowerEdge server improvements for VDI
The all-new Dell PowerEdge R7625 with AMD EPYC 4th Gen processors delivers up to 50 percent higher CPU density in terms of cores per server. This platform is based on the latest technology from AMD to provide better performance and improved scalability for a variety of workloads, including VDI.
Some of the platform enhancements that are especially relevant to VDI workloads are:
- CPU—Up to 50 percent more cores with up to 96 cores per socket, allowing VDI virtual machine (VM) per-node density increases and better VDI VM performance.
- Memory: 50 percent more memory channels (12 per socket, up from 8) with 50 percent faster memory, allowing greater memory capacity and performance to support richer VDI desktop VM configurations for applications that require increased memory.
- I/O—PCIe Gen5 with twice the bandwidth, allowing for high-speed and low-latency NVMe drives, NICs, and GPU accelerators.
- Smart Cooling Technology—Advanced thermal designs and options, such as streamlined airflow pathways within the server, liquid cooling options, and so on, to keep CPUs, high-performance NICs, and GPUs cool and performing optimally.
- Boot Optimized Storage—The 3rd generation Boot Optimized Storage Solution (BOSS-N1), which has been enhanced with full hot-plug support for enterprise class M.2 NVMe SSDs. Additionally, the design is integrated into the server, eliminating the need to dedicate a PCIe slot and giving customers more flexibility with their choice of I/O slots and peripherals.
Benchmarking for VDI
Login VSI by Login Consultants is the industry-standard tool for testing VDI environments and server-based computing (RDSH environments). It installs a standard collection of desktop application software (for example, Microsoft Office and Adobe Acrobat Reader) on each VDI desktop. It then uses launcher systems to connect a specified number of users to available desktops within the environment. Once each user is connected, a login script configures the user environment and then starts the test workload. Each launcher system can launch connections to several ‘target’ machines (VDI desktops).
When designing a desktop virtualization solution, understanding user workloads and profiles is key to understanding the density numbers that the solution can support. At Dell Technologies, we use several Login VSI workload/profile levels, each of which is bound by specific metrics and capabilities, with two targeted at graphics-intensive use cases.
To understand the improvements that we can expect to see with the latest generation of servers compared with the prior-generation servers, we ran the same Login VSI benchmark against both servers. We used a Knowledge Worker profile consisting of 5 to 9 applications and 360p video. The following table shows the user VM configuration:
Table 1. Login VSI Knowledge Worker profile
Workload | vCPUs | RAM | RAM reserved | Desktop video resolution | Operating system |
---|---|---|---|---|---|
Knowledge Worker | 2 | 4 GB | 2 GB | 1920 x 1080 | Windows 10 Enterprise 64-bit |
The following table outlines the test configuration of the hardware and software components:
Table 2. Hardware and software configuration
Component | Configuration |
Compute host hardware | |
Management host hardware | |
Storage | PERC with 6x mixed use SSDs (RAID 10) |
Network | Dell S5248-ON switch |
Broker | VMware Horizon 8 2209 |
Hypervisor | VMware ESXi 8.0.0 |
SQL | Microsoft SQL Server 2019 |
Desktop operating system | Microsoft Windows 10 Enterprise 64-bit, version 22H2 |
Office | Microsoft Office 365 |
Profile management | FSLogix |
Management operating system | Windows Server 2022 |
Login VSI | Version 4.1.40.1 |
Results summary—R7525 compared with R7625
Comparing the 32-core processors of the 4th Gen AMD EPYC to those of the 3rd Gen AMD EPYC using Login VSI showed an approximately 30 percent improvement in VM density. At the same time, we observed an approximately 11 percent improvement in response time.
The following table outlines the test results:
Table 3. Key results of Login VSI testing
Server | Density per host (higher is better) | User experience—VSI base (lower is better) |
PowerEdge R7525 | 265 VMs | 896 milliseconds |
PowerEdge R7625 | 345 VMs | 794 milliseconds |
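The summary percentages can be derived from the two rows of Table 3. The following sketch uses only the table values; the variable names are illustrative.

```python
# Values from Table 3
r7525_vms, r7525_vsi_base_ms = 265, 896
r7625_vms, r7625_vsi_base_ms = 345, 794

# Density: higher is better, so report the relative increase
density_gain_pct = (r7625_vms / r7525_vms - 1) * 100

# VSI base response time: lower is better, so report the relative reduction
response_reduction_pct = (1 - r7625_vsi_base_ms / r7525_vsi_base_ms) * 100

print(f"Additional VMs per host:  {r7625_vms - r7525_vms}")
print(f"Density gain:             {density_gain_pct:.1f}%")
print(f"Response time reduction:  {response_reduction_pct:.1f}%")
```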
Conclusion
With up to 96 cores per socket and significant increases in memory bandwidth, Dell PowerEdge servers with 4th Gen AMD EPYC processors continue to provide best-in-class features and specifications to satisfy the most demanding workloads. For VDI workloads, with the same number of cores, we observed a 30 percent increase in density along with a reduction of more than 11 percent in response time.