On the record for Sapphire: World Record SAP HANA Performance with Dell PowerEdge R760 Servers
Wed, 17 May 2023 15:19:25 -0000
SAP HANA is an in-memory database platform used to manage large amounts of data in real time for purposes such as point-of-sale data, real-time analytics for inventory management, supply-chain optimization, and customer behavior analysis. As the amount of data that must be processed grows, servers can be challenged to deliver information and analysis quickly enough to meet the increasingly demanding business requirements for fast data access. That is why SAP HANA users are always on the lookout for better-performing servers.
SAP publishes the results of standardized benchmark tests to assist customers in comparing the performance of different servers when running SAP HANA. Performance results for the Dell PowerEdge R760 server were recently published, and we are excited to share some highlights about how well this server stacks up against other servers.
Fifteen different data points are available for comparing the PowerEdge R760 server to other servers. The standard benchmarks measure three key performance data points for each of the five different database sizes on which the PowerEdge R760 server was tested.
The following table shows the rank of the PowerEdge R760 server for each of the 15 points of comparison among all the different servers tested using SAP’s benchmark version 3. The source of the data is the publicly available SAP Standard Application Benchmarks directory, accessed on 04-19-2023.
The PowerEdge R760 server outperformed all the other servers in 13 of the 15 benchmark points of comparison, and it ranked second in the remaining two.
The PowerEdge R760 server is a two-socket server built with the latest 4th Gen Intel Xeon Scalable processors. It outperforms all other servers in SAP HANA benchmarking, including 4- and 8-socket servers, in every database size of up to 6.5 billion initial records. It provides the performance and versatility to address your most demanding applications, including SAP HANA, with massive databases and mission-critical requirements for real-time performance.
Look for an in-depth study from Prowess: Remarkable SAP Benchmark Performance Results for Dell PowerEdge R760 Servers (delltechnologies.com) about how the PowerEdge R760 server performed against top competitors on the SAP Standard Application Benchmarks.
Learn more about PowerEdge servers and Dell Technologies solutions for SAP.
About the Author:
Seamus Jones
Director, Server Technical Marketing
Seamus serves Dell as Director of Server Technical Marketing, seasoned with over 20 years of real-world experience in both North America and EMEA. His unique perspective comes from experience consulting customers on data center initiatives and server virtualization strategies.
Related Blog Posts
Improving Oracle Performance with New Dell 4 Socket Servers
Mon, 02 Oct 2023 21:23:14 -0000
Deploying Oracle on a PowerEdge server presents various challenges typical of intricate software and hardware integrations. As servers age, they become increasingly expensive to maintain and can have detrimental effects on business productivity, primarily because of the increased demand on IT personnel's time and the heightened risk of unscheduled downtime.
Older servers hosting virtualized Oracle® Database applications might struggle to keep up with growing usage demands. This can result in slower operations that, for example, dissuade customers from browsing a website for products and completing online transactions. Aging hardware is also more susceptible to data loss or corruption, potential security vulnerabilities, and elevated maintenance and repair expenses.
One effective solution to address these issues is migrating Oracle Database workloads from older servers to newer ones, such as the 16th Generation Dell™ PowerEdge™ R960 featuring 4th Gen Intel® Xeon® Scalable processors. This upgrade not only mitigates the aforementioned concerns but also opens doors to further IT enhancements and facilitates the achievement of business objectives. It can lead to improved customer responsiveness and quicker time-to-market.
Additionally, transitioning workloads from virtualized environments to bare metal solutions has the potential to significantly enhance transactional database performance, particularly for databases that come with high-performance service-level agreements (SLAs).
We recently submitted one of the new Dell PowerEdge R960 servers to Principled Technologies for testing with Oracle Database 19c and compared the results to previous generations. The performance exceeded even our own lofty expectations. As the graph below shows, the R960 was able to process almost 8 times more transactions than the PowerEdge R920 and over double the transactions of the previous-generation PowerEdge R940.
For more details, read the full report: Improving Oracle Database performance: Moving to Dell PowerEdge R960 servers with Intel processors (Dell Technologies Info Hub).
When refreshing platforms, it is critical to consider these performance characteristics along with ten other common concerns:
- Compatibility Issues: Ensuring that the version of Oracle Database you want to deploy is compatible with the specific PowerEdge server hardware, operating system, and other software components can be a significant challenge. Compatibility matrices provided by Oracle and Dell (the manufacturer of PowerEdge servers) need to be thoroughly reviewed.
- Hardware Selection: Selecting the right PowerEdge server model with the appropriate CPU, memory, storage, and networking capabilities to meet the performance and scalability requirements of Oracle can be tricky. Overestimating or underestimating these requirements can lead to performance bottlenecks or wasted resources.
- Operating System Configuration: Configuring the operating system (typically, a Linux distribution like Oracle Linux or Red Hat Enterprise Linux) to meet Oracle's specific requirements can be complex. This includes setting kernel parameters, file system configurations, and installing necessary packages.
- Storage Configuration: Setting up storage correctly is critical for Oracle databases. Customers need to configure RAID levels, partitioning, and file systems optimally. Ensuring high I/O throughput and low latency is essential for database performance.
- Network Configuration: Proper network configuration, including setting up the network stack and configuring firewalls, is important for database security and accessibility.
- Oracle Database Configuration: Configuring Oracle Database itself, including memory allocation, database parameters, and storage management, requires a deep understanding of Oracle's architecture and best practices. Misconfigurations can lead to poor performance and stability issues.
- Backup and Recovery Strategy: Developing a robust backup and recovery strategy is crucial to protect the database against data loss. This includes configuring Oracle Recovery Manager (RMAN) and ensuring that backups are performed regularly and can be restored successfully.
- High Availability and Disaster Recovery: Implementing high availability and disaster recovery solutions, such as Oracle Real Application Clusters (RAC) or Data Guard, can be complex and requires careful planning and testing.
- Licensing and Compliance: Managing Oracle licenses and ensuring compliance with Oracle's licensing policies can be challenging, especially in virtualized or clustered environments.
- Performance Tuning: Continuously monitoring and tuning the Oracle database and the underlying PowerEdge server to optimize performance can be an ongoing challenge. This includes identifying and addressing performance bottlenecks and ensuring that the hardware is used efficiently.
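The operating system configuration item above mentions setting kernel parameters. As a minimal sketch of how such a check might be automated, the following Python compares `key = value` lines (the format printed by `sysctl -a`) against a table of minimums. The function names and the threshold values are hypothetical and chosen only for illustration; the real requirements should come from the Oracle installation guide for your release.

```python
from typing import Dict, List


def parse_sysctl(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines, ignoring blank lines and comments."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def find_shortfalls(current: Dict[str, str], minimums: Dict[str, int]) -> List[str]:
    """Report parameters that are missing or below the required minimum.

    Handles single-integer parameters only; multi-valued parameters
    would need their own parsing.
    """
    problems: List[str] = []
    for key, minimum in minimums.items():
        raw = current.get(key)
        if raw is None:
            problems.append(f"{key}: not set (need >= {minimum})")
        elif int(raw) < minimum:
            problems.append(f"{key}: {raw} is below {minimum}")
    return problems


# Hypothetical minimums for illustration only, NOT Oracle's published values.
EXAMPLE_MINIMUMS = {"fs.file-max": 1000000, "fs.aio-max-nr": 500000}

# Sample input standing in for the output of `sysctl -a` on a real host.
sample = """
fs.file-max = 2000000
fs.aio-max-nr = 65536
"""

print(find_shortfalls(parse_sysctl(sample), EXAMPLE_MINIMUMS))
```

On a real host, the sample text would be replaced by the captured output of `sysctl -a`, and the minimums table would be filled in from the vendor documentation.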
To address these challenges, it is often advisable for customers to work with experienced system administrators, database administrators, and consultants who have expertise in both Oracle and PowerEdge server deployments. Additionally, staying informed about the latest updates, patches, and best practices from Oracle and Dell can help mitigate potential issues. Partnering with Dell Technologies gives customers access to this expertise and to the performance enhancements of the PowerEdge R960.
Author: Seamus Jones
Director, Server Technical Marketing Engineering
MLPerf™ Inference v4.0 Performance on Dell PowerEdge R760xa and R7615 Servers with NVIDIA L40S GPUs
Fri, 05 Apr 2024 17:41:56 -0000
Abstract
Dell Technologies recently submitted results to the MLPerf™ Inference v4.0 benchmark suite. This blog highlights Dell Technologies’ closed division submission made for the Dell PowerEdge R760xa, Dell PowerEdge R7615, and Dell PowerEdge R750xa servers with NVIDIA L40S and NVIDIA A100 GPUs.
Introduction
This blog provides relevant conclusions about the performance improvements that are achieved on the PowerEdge R760xa and R7615 servers with the NVIDIA L40S GPU compared to the PowerEdge R750xa server with the NVIDIA A100 GPU. In the following comparisons, we held the GPU constant across the PowerEdge R760xa and PowerEdge R7615 servers to show the excellent performance of the NVIDIA L40S GPU. We also compared the PowerEdge R750xa server with the NVIDIA A100 GPU to its successor, the PowerEdge R760xa server with the NVIDIA L40S GPU.
System Under Test configuration
The following table shows the System Under Test (SUT) configuration for the PowerEdge servers.
Table 1: SUT configuration of the Dell PowerEdge R750xa, R760xa, and R7615 servers for MLPerf Inference v4.0
Server | PowerEdge R750xa | PowerEdge R760xa | PowerEdge R7615 |
MLPerf Version | V4.0 | V4.0 | V4.0 |
GPU | NVIDIA A100 PCIe 80 GB | NVIDIA L40S | NVIDIA L40S |
Number of GPUs | 4 | 4 | 2 |
MLPerf System ID | R750xa_A100_PCIe_80GBx4_TRT | R760xa_L40Sx4_TRT | R7615_L40Sx2_TRT |
CPU | 2 x Intel Xeon Gold 6338 CPU @ 2.00GHz | 2 x Intel Xeon Platinum 8470Q | 1 x AMD EPYC 9354 32-Core Processor |
Memory | 512 GB | 512 GB | 512 GB |
Software Stack (all servers) | TensorRT 9.3.0, CUDA 12.2, cuDNN 8.9.2, Driver 535.54.03 / 535.104.12, DALI 1.28.0 |
The following table lists the technical specifications of the NVIDIA L40S and NVIDIA A100 GPUs.
Table 2: Technical specifications of the NVIDIA A100 and NVIDIA L40S GPUs
Model | NVIDIA A100 | NVIDIA A100 | NVIDIA L40S |
Form factor | SXM4 | PCIe Gen4 | PCIe Gen4 |
GPU architecture | Ampere | Ampere | Ada Lovelace |
CUDA cores | 6912 | 6912 | 18176 |
Memory size | 80 GB | 80 GB | 48 GB |
Memory type | HBM2e | HBM2e | GDDR6 |
Base clock | 1275 MHz | 1065 MHz | 1110 MHz |
Boost clock | 1410 MHz | 1410 MHz | 2520 MHz |
Memory clock | 1593 MHz | 1512 MHz | 2250 MHz |
MIG support | Yes | Yes | No |
Peak memory bandwidth | 2039 GB/s | 1935 GB/s | 864 GB/s |
Total board power | 500 W | 300 W | 350 W |
Dell PowerEdge R760xa server
The PowerEdge R760xa server shines as an Artificial Intelligence (AI) workload server with its cutting-edge inferencing capabilities. This server represents the pinnacle of performance in the AI inferencing space with its processing prowess enabled by Intel Xeon Platinum processors and NVIDIA L40S GPUs. Coupled with NVIDIA TensorRT and CUDA 12.2, the PowerEdge R760xa server is positioned perfectly for any AI workload including, but not limited to, Large Language Models, computer vision, Natural Language Processing, robotics, and edge computing. Whether you are processing image recognition tasks, natural language understanding, or deep learning models, the PowerEdge R760xa server provides the computational muscle for reliable, precise, and fast results.
Figure 1: Front view of the Dell PowerEdge R760xa server
Figure 2: Top view of the Dell PowerEdge R760xa server
Dell PowerEdge R7615 server
The PowerEdge R7615 server stands out as an excellent choice for AI, machine learning (ML), and deep learning (DL) workloads due to its robust performance capabilities and optimized architecture. With its powerful processing capabilities including up to three NVIDIA L40S GPUs supported by TensorRT, this server can handle complex neural network inference and training tasks with ease. Powered by a single AMD EPYC processor, this server performs well for any demanding AI workloads.
Figure 3: Front view of the Dell PowerEdge R7615 server
Figure 4: Top view of the Dell PowerEdge R7615 server
Dell PowerEdge R750xa server
The PowerEdge R750xa server is a perfect blend of technological prowess and innovation. This server is equipped with Intel Xeon Gold processors and the latest NVIDIA GPUs. The PowerEdge R750xa server is designed for the most demanding AI, ML, and DL workloads as it is compatible with the latest NVIDIA TensorRT engine and CUDA version. With up to nine PCIe Gen4 slots and availability in a 1U or 2U configuration, the PowerEdge R750xa server is an excellent option for any demanding workload.
Figure 5: Front view of the Dell PowerEdge R750xa server
Figure 6: Top view of the Dell PowerEdge R750xa server
Performance results
Classical Deep Learning models performance
The following figure presents the results as a ratio of normalized numbers over the Dell PowerEdge R750xa server with four NVIDIA A100 GPUs. This result provides an easy-to-read comparison of three systems and several benchmarks.
Figure 7: Normalized NVIDIA L40S GPU performance over the PowerEdge R750xa server with four A100 GPUs
The green trendline represents the performance of the Dell PowerEdge R750xa server with four NVIDIA A100 GPUs. With a score of 1.00 for each benchmark value, the results have been divided by themselves to serve as the baseline in green for this comparison. The blue trendline represents the performance of the PowerEdge R760xa server with four NVIDIA L40S GPUs that has been normalized by dividing each benchmark result by the corresponding score achieved by the PowerEdge R750xa server. In most cases, the performance achieved on the PowerEdge R760xa server outshines the results of the PowerEdge R750xa server with NVIDIA A100 GPUs, proving the expected improvements from the NVIDIA L40S GPU. The red trendline has also been normalized over the PowerEdge R750xa server and represents the performance of the PowerEdge R7615 server with two NVIDIA L40S GPUs. It is interesting that the red line almost mimics the blue line. This result suggests that the PowerEdge R7615 server, despite having half the compute resources, still performs comparably well in most cases, showing its efficiency.
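The normalization described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the benchmark keys, and the throughput numbers are placeholders invented for the example and are not MLPerf results.

```python
from typing import Dict


def normalize_to_baseline(
    results: Dict[str, float], baseline: Dict[str, float]
) -> Dict[str, float]:
    """Divide each benchmark result by the baseline system's result
    for the same benchmark."""
    normalized: Dict[str, float] = {}
    for benchmark, score in results.items():
        if benchmark not in baseline:
            continue  # skip benchmarks the baseline system did not run
        normalized[benchmark] = score / baseline[benchmark]
    return normalized


# Placeholder throughput values for illustration only; NOT MLPerf results.
baseline_system = {"benchmark_a": 100.0, "benchmark_b": 50.0}
candidate_system = {"benchmark_a": 120.0, "benchmark_b": 45.0}

# The baseline divided by itself is 1.0 for every benchmark.
print(normalize_to_baseline(baseline_system, baseline_system))

# Values above 1.0 mean the candidate outperformed the baseline on that benchmark.
print(normalize_to_baseline(candidate_system, baseline_system))
```

Plotting the normalized values for each system against the benchmark names would produce lines analogous to the green, blue, and red lines in Figure 7.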
Generative AI performance
The latest submission saw the introduction of the new Stable Diffusion XL benchmark. In the context of generative AI, Stable Diffusion is a text-to-image model that generates coherent image samples. It achieves this result by gradually refining and spreading out information throughout the generation process. Consider the example of dropping food coloring into a large bucket of water. Initially, only a small, concentrated portion of the water turns color, but gradually the coloring is evenly distributed in the bucket.
The following table shows the excellent performance of the PowerEdge R760xa server with the powerful NVIDIA L40S GPU for the GPT-J and Stable Diffusion XL benchmarks. The PowerEdge R760xa takes the top spot in GPT-J and Stable Diffusion XL when compared to other NVIDIA L40S results.
Table 3: Benchmark results for the PowerEdge R760xa server with the NVIDIA L40S GPU
Benchmark | Dell PowerEdge R760xa L40S result (Server in Queries/s, Offline in Samples/s) | Dell’s gain over the next-best non-Dell result (%) |
Stable Diffusion XL Server | 0.65 | 5.24 |
Stable Diffusion XL Offline | 0.67 | 2.28 |
GPT-J 99 Server | 12.75 | 4.33 |
GPT-J 99 Offline | 12.61 | 1.88 |
GPT-J 99.9 Server | 12.75 | 4.33 |
GPT-J 99.9 Offline | 12.61 | 1.88 |
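The gain column in Table 3 can be read back into an approximate next-best score. The sketch below assumes the gain is defined as (Dell result - other result) / other result x 100, which this blog does not state explicitly, and the function name is illustrative. Because the published values are rounded, the implied scores are approximations, not reported results.

```python
def implied_reference_score(dell_score: float, gain_percent: float) -> float:
    """Back out the comparison score, ASSUMING
    gain_percent = (dell_score - other) / other * 100."""
    return dell_score / (1.0 + gain_percent / 100.0)


# (benchmark, Dell R760xa result, gain in %) copied from Table 3.
table_3 = [
    ("Stable Diffusion XL Server", 0.65, 5.24),
    ("Stable Diffusion XL Offline", 0.67, 2.28),
    ("GPT-J 99 Server", 12.75, 4.33),
    ("GPT-J 99 Offline", 12.61, 1.88),
    ("GPT-J 99.9 Server", 12.75, 4.33),
    ("GPT-J 99.9 Offline", 12.61, 1.88),
]

for name, score, gain in table_3:
    # Approximate only: inputs are rounded and the gain definition is assumed.
    print(f"{name}: ~{implied_reference_score(score, gain):.2f}")
```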
Conclusion
The MLPerf Inference submissions elicit insightful like-to-like comparisons. This blog highlights the impressive performance of the NVIDIA L40S GPU in the Dell PowerEdge R760xa and PowerEdge R7615 servers. Both servers performed well when compared to the performance of the Dell PowerEdge R750xa server with the NVIDIA A100 GPU. The outstanding performance improvements in the NVIDIA L40S GPU coupled with the Dell PowerEdge server position Dell customers to succeed in AI workloads. With the advent of the GPT-J and Stable Diffusion XL models, the Dell PowerEdge server is well positioned to handle generative AI workloads.