Unlock New MX CPU and Storage Configurations with a Thermally Optimized Air-Cooled Chassis
Fri, 03 Mar 2023 20:08:02 -0000
As CPU power continues to rise across the server industry, Dell Technologies continues to offer customers feature-rich air-cooled configurations. Dell Engineering has applied thermal innovation and machine learning to the Dell PowerEdge MX chassis to support the MX760c server sled with a broad range of 4th Gen Intel® Xeon® Scalable processors and local storage configurations.
This Direct from Development tech note describes the new capabilities using air cooling that Dell has added to the PowerEdge MX configurations.
Introduction
The PowerEdge MX7000 is a modular chassis that allows customers to combine compute, storage, networking, and management resources to meet their specific workload needs. Industry trends, including new CPU technologies that raise the power per server sled, continually make it harder to air-cool feature-rich configurations. Dell Engineering used machine learning combined with next-generation fans to offer high-performance 4th Gen Intel® Xeon® Scalable processors in an air-cooled chassis with more local storage configurations than previously available.
Dell Engineering expertise
There are 8! = 40,320 modular sled permutations in the 8-slot MX chassis. Dell Engineering conducted a Design of Experiments (DOE) to train a machine learning model that dynamically calculates the airflow cooling capacity for each of the eight slots. This technology enables Dell to maximize the shared cooling infrastructure of the MX7000, unlock configurations that were previously not possible, and provide clear guidance to customers about how to thermally optimize their chassis. When a chassis configuration is optimized for cooling, the fans run more efficiently at lower speeds across the server workload, which lowers fan power, reduces cooling costs, and reduces chassis noise.
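The permutation count quoted above can be checked directly. The following is a minimal sketch, assuming eight distinct sleds with one sled per slot. The sled labels are hypothetical and exist only to drive the enumeration.

```python
import math
from itertools import permutations

SLOTS = 8  # the MX7000 chassis has eight sled slots

# Closed-form count: 8! orderings of eight distinct sleds.
total = math.factorial(SLOTS)

# Cross-check by enumerating every ordering of hypothetical sled labels.
sleds = [f"sled{i}" for i in range(1, SLOTS + 1)]
enumerated = sum(1 for _ in permutations(sleds))

print(total, enumerated)
```

Both values come out to 40,320, which is why an exhaustive physical test of every layout is impractical and a trained model is used instead.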
Thermally optimized chassis
The ability of the MX7000 chassis to air-cool the eight slots is directly affected by the storage configuration of each sled as well as the placement of sleds in the chassis. For example, pulling air through a sled that has six hard drives is harder than pulling air through a sled that has four. Machine learning is built into the sled and chassis firmware to dynamically analyze the ability of the chassis to deliver air-cooling to each sled.
A consistent storage configuration maximizes cooling across all sleds and enables the MX760c to support up to six 2.5-in. storage devices with the latest 4th Gen Intel® Xeon® CPUs.
A varied storage configuration with MX760c sleds enables support for up to four 2.5-in. storage devices to maximize cooling through each sled.
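The two storage rules above can be expressed as a small check. This is only an illustrative sketch, not Dell firmware logic. It assumes that "consistent" means every sled in the chassis carries the same number of drives, and the function names `max_drives_per_sled` and `layout_is_supported` are hypothetical.

```python
from typing import Sequence

MAX_DRIVES_CONSISTENT = 6  # from the text: consistent storage configuration
MAX_DRIVES_VARIED = 4      # from the text: varied storage configuration


def max_drives_per_sled(drive_counts: Sequence[int]) -> int:
    """Return the per-sled 2.5-in. drive limit for a chassis layout.

    Assumption: 'consistent' means every sled has the same drive count.
    """
    if not drive_counts:
        raise ValueError("at least one sled is required")
    consistent = len(set(drive_counts)) == 1
    return MAX_DRIVES_CONSISTENT if consistent else MAX_DRIVES_VARIED


def layout_is_supported(drive_counts: Sequence[int]) -> bool:
    """Check that no sled exceeds the limit for its layout type."""
    return max(drive_counts) <= max_drives_per_sled(drive_counts)


print(layout_is_supported([6] * 8))        # eight identical six-drive sleds
print(layout_is_supported([6] + [4] * 7))  # mixed layout with a six-drive sled
```

Under these assumptions, the first layout passes and the second does not, because a mixed layout caps each sled at four drives.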
MX7000 air-cooling enhancements
MX7000 chassis and MX sleds introduced the capability to dynamically calculate the cooling based on the chassis configuration. This capability enabled Dell to offer a thermally optimized chassis with a consistent storage configuration that increases cooling for sleds by 20 percent. Dell used this additional cooling capability to offer high-power CPUs with storage configurations that were not supported by previous generations.
The industry trend of increasing power per node every generation has significantly challenged the ability to deliver air-cooled solutions. The MX7000 chassis introduced the next-generation Gold Grade chassis fans with the MX760c sleds to provide an air-cooled solution with the latest high-powered CPUs. Gold Grade fans deliver 25 percent more cooling per sled than the previous-generation Silver Grade fans.
Enterprise Infrastructure Planning Tool
The Dell Enterprise Infrastructure Planning Tool (EIPT) helps IT professionals plan and tune their systems and infrastructure for maximum efficiency. Customers can model their customized MX7000 chassis and sled configurations in EIPT. The trained machine learning model enables the tool to identify the maximum data center ambient temperature supported by the sleds. It also identifies the most thermally optimized configuration when sleds have a varied storage configuration. This means that new and existing customers can identify the most efficient sled-to-slot configuration to optimize their chassis for maximum cooling capability while lowering power, costs, and fan noise.
Conclusion
Dell continues to deliver innovative solutions that expand the air-cooled feature-rich configuration choices for the PowerEdge MX7000 chassis and server sleds. Dell Engineering combined machine learning technology with next-generation fans to provide customers the latest high-performance CPUs with more local storage configurations than previous generations in an air-cooled chassis. In addition to the expanded air-cooling configurations, Dell also offers Direct Liquid Cooling (DLC) for the PowerEdge MX7000 chassis and server sleds. The features and potential benefits of DLC are discussed in a separate Direct from Development tech note.
References
- Tech Talk Video: The MX7000 Introduces a New Thermal Innovation
- Direct from Development Tech Note: The History of Server and Data Center Cooling Technologies
Related Documents
Are Rugged Compact Platforms Ready for Edge AI?
Thu, 14 Mar 2024 16:47:06 -0000
Scalers AI™ tested the Impellers Defect Inspection at the Edge on the Dell PowerEdge XR5610 server. Impellers are rotating components used in various industrial processes, including fluid handling in pumps and fans. Quality inspection of impellers is crucial to ensure their reliable performance and durability.
The Dell™ PowerEdge™ XR5610 server supports 50 simultaneous streams running AI defect detection in a single-CPU configuration, and the Dell™ PowerEdge™ XR portfolio scales to four CPUs at a near-linear rate.
1.4x generation-over-generation performance improvement using Intel® Deep Learning Boost and Intel® OpenVINO™.
Smart Factory Solution | Defect Detection Solution
Dell™ PowerEdge™ XR5610 servers, equipped with 4th Gen Intel® Xeon® Scalable processors, are well suited to Edge AI applications that involve both AI inference and training at the edge. The rugged form factor, extended temperature range, and scalability to four sockets enable compute to be deployed in the physical world closer to the point of data creation, allowing for near-real-time insights.
Fast-track development with access to the solution code:
As part of this effort, Scalers AI™ is making the solution code available. Save hundreds of hours of development with the solution code. Contact your Dell™ representative or Scalers AI™ at contact@scalers.ai for access.
Next Generation Dell PowerEdge XR7620 Server Machine Learning (ML) Performance
Fri, 03 Mar 2023 19:57:26 -0000
Summary
Dell Technologies has recently announced the launch of next-generation Dell PowerEdge servers that deliver advanced performance and energy efficient design.
This Direct from Development (DfD) tech note describes the new capabilities you can expect from the next-generation Dell PowerEdge servers powered by the 4th Gen Intel® Xeon® Scalable processors MCC SKU stack. This document covers the tests and results for ML performance of Dell’s next-generation PowerEdge XR7620 using the industry-standard MLPerf Inference v2.1 benchmarking suite. The XR7620 targets workloads in manufacturing, retail, defense, and telecom, all of which require AI/ML inferencing capabilities at the edge.
With support for up to two 300W GPU accelerator cards to handle your most demanding edge workloads, the XR7620 delivers 45% faster image classification than the previous-generation Dell XR12 server with just one 300W GPU accelerator for ML/AI scenarios at the enterprise edge. The combination of low latency and high processing power allows for faster and more efficient analysis of data, enabling organizations to make decisions in real time.
Edge computing
Edge computing, in a nutshell, brings computing power close to the source of the data. As Internet of Things (IoT) endpoints and other devices generate more and more time-sensitive data, edge computing becomes increasingly important. Machine Learning (ML) and Artificial Intelligence (AI) applications are particularly suitable for edge computing deployments. The environmental conditions for edge computing are typically vastly different from those at centralized data centers. Edge computing sites might, at best, consist of little more than a telecommunications closet with minimal or no HVAC. Rugged, purpose-built, compact, and accelerated edge servers are therefore ideal for such deployments. The Dell PowerEdge XR7620 server checks all of those boxes. It is a high-performance, high-capacity server for the most demanding workloads, certified to operate in rugged, dusty environments ranging from -5°C to 55°C (23°F to 131°F), all within a short-depth 450mm (from ear-to-rack) form factor.
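As a quick check of the operating range quoted above, the Celsius-to-Fahrenheit conversion can be reproduced in a few lines. The helper name `celsius_to_fahrenheit` is illustrative only.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


low_f = celsius_to_fahrenheit(-5)   # lower bound of the certified range
high_f = celsius_to_fahrenheit(55)  # upper bound of the certified range
print(low_f, high_f)  # prints: 23.0 131.0
```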
MLPerf Inference workload summary
MLPerf is a multi-faceted benchmark suite that benchmarks different workload types and different processing scenarios. There are five workloads and three processing scenarios. The workloads are:
- Image classification
- Object detection
- Medical image segmentation
- Speech-to-text
- Language processing
The scenarios are single-stream (SS), multi-stream (MS), and Offline.
The tasks are self-explanatory and are listed in the table below, along with the dataset used, the ML model used, and descriptions. The single-stream tests reported results at the 90th percentile; multi-stream tests reported results at the 99th percentile.
Table 1. MLPerf Inference benchmark scenarios
Scenario | Performance metric | Example use cases |
Single-stream | 90th percentile latency | Google voice search: Waits until the query is asked and returns the search results. |
Offline | Measured throughput | Batch processing, also known as offline processing. Google Photos identifies pictures, tags people, and generates an album with specific people and locations/events offline. |
Multi-stream | 99th percentile latency | Example 1: Multicamera monitoring and quick decisions. Multi-stream resembles a CCTV backend system that processes multiple real-time streams to identify suspicious behaviors. Example 2: Self-driving systems merge multiple camera inputs and make driving decisions in real time. |
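To make the percentile latency metrics in Table 1 concrete, the sketch below computes a nearest-rank percentile over a list of per-query latencies. It illustrates the statistic only and is not necessarily how the MLPerf load generator computes it. The function name `percentile_latency` and the sample data are hypothetical.

```python
import math
from typing import Sequence


def percentile_latency(latencies_ms: Sequence[float], percentile: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    `percentile` percent of all samples are at or below it."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    ordered = sorted(latencies_ms)
    rank = math.ceil(percentile * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical per-query latencies in milliseconds (not measured data).
samples = [float(ms) for ms in range(1, 101)]  # 1.0, 2.0, ..., 100.0

p90 = percentile_latency(samples, 90)  # statistic used for single-stream
p99 = percentile_latency(samples, 99)  # statistic used for multi-stream
print(p90, p99)
```

A 99th percentile is more sensitive to rare slow queries than a 90th percentile, which suits multi-stream use cases where every stream has to meet its deadline.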
Table 2. MLPerf EdgeSuite for inferencing benchmarks
Industry reports about the future of edge computing
According to Forrester’s report (“Five technology elements make workload affinity possible across the four business scenarios”), most systems today are designed to run software in a single place. This creates performance limitations as conditions change, such as when more sensors are installed in a factory, as more people gather for an event, or as cameras receive more video feed. Workload affinity is the concept of using distributed applications to deploy software automatically where it runs best: in a data center, in the cloud, or across a growing set of connected assets. Innovative AI/ML, analytics, IoT, and container solutions enable new applications, deployment options, and software design strategies. In the future, systems will choose where to run software across a spectrum of possible locations, depending on the needs of the moment.
ML/AI inference performance
Table 3. Dell PowerEdge XR7620 key specifications
MLPerf system suite type | Edge |
Operating System | CentOS 8.2.2004 |
CPU | 4th Gen Intel® Xeon® Scalable processors MCC SKU |
Memory | 512GB |
GPU | NVIDIA A2 |
GPU Count | 1 |
Networking | 1x ConnectX-5 IB EDR 100Gb/Sec |
Software Stack | TensorRT 8.4.2 CUDA 11.6 cuDNN 8.4.1 Driver 510.73.08 DALI 0.31.0 |
Figure 1. Dell PowerEdge XR7620: 2U 2S
Table 4. NVIDIA GPUs Tested:
Brand | GPU | GPU memory | Max power consumption | Form factor | 2-way bridge | Recommended workloads |
PCIe Adapter Form Factor | ||||||
NVIDIA | A2 | 16 GB GDDR6 | 60W | SW, HHHL or FHHL | n/a | AI Inferencing, Edge, VDI |
NVIDIA | A30 | 24 GB HBM2 | 165W | DW, FHFL | Y | AI Inferencing, AI Training |
NVIDIA | A100 | 80 GB HBM2e | 300W | DW, FHFL | Y, Y | AI Training, HPC, AI Inferencing |
The edge server offloads the image processing to the GPU. And just as servers have different price/performance levels to suit different requirements, so do GPUs. XR7620 supports up to 2xDW 300W GPUs or 4xSW 150W GPUs, part of the constantly evolving scalability and flexibility offered by the Dell PowerEdge server portfolio. In comparison, the previous gen XR11 could support up to 2xSW GPUs.
Edge server vs data center server comparison[1]
When tested with the NVIDIA A100 GPU in the Offline scenario, the Dell XR7620 performed within 1% of the prior-generation Dell PowerEdge rack server. With a depth of 430mm, the XR7620 edge server can deliver performance similar to that of a rack server for an AI inferencing scenario. See Figure 2.
Figure 2. Rack vs edge server MLPerf Offline performance
XR7620 performance with NVIDIA A2 GPU
XR7620 was also tested with NVIDIA A2 GPU for the entire range of MLPerf workloads in the Offline scenario. For the results, see Figure 3.
Figure 3. XR7620 Offline performance results
XR7620 was also tested with NVIDIA A2 GPU for the entire range of MLPerf workloads in the Single Stream scenario. See Figure 4.
Figure 4. XR7620 Single Stream Performance results
XR7620 was also tested with NVIDIA A30 GPU for the entire range of MLPerf workloads in the Offline Scenario. See Figure 5.
Figure 5. XR7620 Offline Performance results on A30 GPU
XR7620 was also tested with NVIDIA A30 GPU for the entire range of MLPerf workloads in the Single Stream scenario. See Figure 6.
Figure 6. XR7620 SS Performance results on A30 GPU
In some scenarios, next generation Dell PowerEdge servers showed improvement over previous generations, due to the integration of the latest technologies such as PCIe Gen 5.
Speech to text
The Dell XR7620 delivered 16% better throughput than the prior-generation Dell server. See Figure 7.
Figure 7. Offline Speech to Text performance improvement on XR7620
Image Classification
The Dell XR7620 delivered 45% better latency than the prior-generation Dell server. See Figure 8.
Figure 8. SS Image Classification performance improvement on XR7620
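The source does not state how the generation-over-generation percentages above were calculated. One common convention, shown here purely as an assumption, is to express a throughput gain and a latency reduction relative to the prior-generation baseline. The function names and input values are hypothetical and were chosen only to reproduce the quoted percentages.

```python
def throughput_gain_pct(baseline: float, new: float) -> float:
    """Percent increase in throughput (higher is better)."""
    return (new - baseline) / baseline * 100


def latency_reduction_pct(baseline: float, new: float) -> float:
    """Percent reduction in latency (lower is better)."""
    return (baseline - new) / baseline * 100


# Hypothetical inputs, not measured results from the source.
speech_gain = throughput_gain_pct(baseline=1000.0, new=1160.0)
image_gain = latency_reduction_pct(baseline=2.0, new=1.1)
print(round(speech_gain, 1), round(image_gain, 1))
```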
Conclusion
The Dell XR portfolio continues to provide a streamlined approach for various edge and telecom deployment options based on various use cases. It addresses the challenge of small form factors at the edge with industry-standard rugged certifications (NEBS) and a compact design that offers scalability and flexibility across a temperature range of -5°C to +55°C. The MLPerf results illustrate real-world AI inferencing performance for servers at the edge. Based on the results in this document, Dell servers continue to provide a complete solution.
References
Notes:
- Based on testing conducted in Dell Cloud and Emerging Technology lab, January 2023. Results to be submitted to MLPerf in Q2, FY24.
- Unverified MLPerf v2.1 Inference. Results not verified by MLCommons Association. MLPerf name and logo are trademarks of MLCommons Association in the United States and other countries. All rights reserved. Unauthorized use strictly prohibited. See www.mlcommons.org for more information.
[1] Based on testing conducted in Dell Cloud and Emerging Technology lab, January 2023.