Direct from Development – PowerEdge MX Secure Chassis Management
Thu, 12 Nov 2020 18:53:31 -0000
Summary
This blog describes the new security features built into the Dell EMC MX7000 chassis.
We cover the secure boot features built into the Management Modules and iDRAC, the ground-up security design incorporating SELinux and least-privilege processes, and the new mechanisms that secure all management traffic inside the chassis by authenticating and authorizing every component and by encrypting all internal management network traffic.
The intent of this work is to give customers a system they can trust and rely on, one that is hardened against attack using the best available security techniques.
Secure Boot
The first principle of security of an embedded management controller is answering the question of what code is running on that management controller. Is the code running on the management controller authentic code from Dell, or has the device been attacked or compromised in any way? The way that we have comprehensively addressed this question for the MX7000 Management Module is our secure boot and chain of trust. Using these techniques, explained below, we can ensure that the Management Module is running unmodified code that has been authenticated by Dell, and that there is no way an attacker has tampered with or replaced any code, either through a supply-chain attack, or through any kind of online attack.
The technique that we use to secure the Management Module is based on the “Chain of Trust” concept. In this concept, each stage of the boot process uses digital signatures to cryptographically verify that the next stage of boot is signed properly before jumping to the next stage.
The chain of trust starts in the factory, when the iDRACs and Management Modules that make up the MX7000 are built. Our hardware is programmed with keys, fused to the device, that allow the processor to verify the bootloader before starting it. The bootloader in turn has its own keys to verify the kernel. Once booted, the kernel runs on a read-only file system, further preventing tampering. Each Management Module and iDRAC is also programmed with a device-unique identity certificate and its public/private keypair, which the device uses to identify itself to others as authentic Dell hardware. These certificates are signed by the Dell Certificate Authority, are unique to the device, and are stored encrypted by the device’s Hardware Root Key, described below. Each layer of the system thus verifies that the next is authentic and has not been modified, creating a complete chain of trust from hardware to running code.
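The stage-by-stage verification described above can be illustrated with a minimal Python sketch. It is not Dell's implementation: it uses SHA-256 digests as a stand-in for digital signature verification (the Python standard library has no asymmetric signature support), and the stage names, image contents, and the `verify_chain` helper are all hypothetical.

```python
import hashlib


def digest(image: bytes) -> str:
    """Return the SHA-256 hex digest of a firmware image."""
    return hashlib.sha256(image).hexdigest()


def verify_chain(stages, fused_digest):
    """Walk the boot stages in order, checking each before handing over.

    stages is a list of (name, image, digest_of_next_stage) tuples; the
    last stage carries None. fused_digest plays the role of the key fused
    into the hardware and anchors the first check.
    Returns (True, None) on success or (False, failing_stage_name).
    """
    expected = fused_digest
    for name, image, next_digest in stages:
        if digest(image) != expected:
            return False, name  # refuse to jump to an unverified stage
        expected = next_digest
    return True, None


# Hypothetical images, for illustration only.
BOOTLOADER = b"bootloader-image"
KERNEL = b"kernel-image"
STAGES = [
    ("bootloader", BOOTLOADER, digest(KERNEL)),
    ("kernel", KERNEL, None),
]
```

Calling `verify_chain(STAGES, digest(BOOTLOADER))` succeeds, while swapping in a modified kernel image makes the check fail at the `kernel` stage, before that image would ever run.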
One critical part of the secure boot process is the presence of a unique-per-machine Hardware Root Key (HRK). This symmetric encryption key is physically fused into the microprocessor during manufacturing. The HRK is never visible to or extractable by the OS or applications running on the management controller. However, applications can ask the hardware crypto accelerator to encrypt or decrypt information using this key. More importantly for our security design, access to the HRK can be disabled at runtime in a manner that cannot be re-enabled without a power cycle. If at any point in the boot process the system detects that it is running non-Dell code, the HRK is disabled. This matters because the device identity credentials are stored encrypted by the HRK: with the HRK disabled, unauthenticated code cannot decrypt them.
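The one-way "disable until power cycle" behavior can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the hardware interface: the `HardwareRootKey` class and `boot` function are hypothetical, and an HMAC stands in for the accelerator's encrypt/decrypt operations.

```python
import hashlib
import hmac


class HardwareRootKey:
    """Toy model of a fused key that software can use but never read."""

    def __init__(self, fused_key: bytes):
        self._key = fused_key      # fused at manufacture in real hardware
        self._disabled = False     # latch; cleared only by a power cycle

    def disable(self) -> None:
        """Latch the key off. There is deliberately no enable() method."""
        self._disabled = True

    def keyed_digest(self, data: bytes) -> bytes:
        """Stand-in for an encrypt/decrypt request to the accelerator."""
        if self._disabled:
            raise PermissionError("HRK access disabled until next power cycle")
        return hmac.new(self._key, data, hashlib.sha256).digest()


def boot(hrk: HardwareRootKey, firmware_is_authentic: bool) -> HardwareRootKey:
    """Disable the HRK as soon as non-authentic code is detected."""
    if not firmware_is_authentic:
        hrk.disable()
    return hrk
```

In this model a power cycle corresponds to constructing a new `HardwareRootKey` with the same fused key. Nothing running after `disable()` can reach the key again within the same boot.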
SELinux and Least Privilege
After you have ensured that the physical flash is secure and running clean, signed Dell EMC firmware images, the next step is to protect the system from online attacks that would allow an attacker to gain access to a running system through some software vulnerability. We have done this through a combination of two techniques that we have integrated into our development process. First, we have adopted SELinux for all of the management controllers in the MX7000: the Management Module as well as the iDRACs. The 1.0 version of MX7000 firmware ships with SELinux fully enabled and set to the highest level of enforcement out of the box, without any configuration requirements for customers to worry about. The second major runtime security initiative that we have delivered is “least privilege”. This security concept requires each process to run under its own unique, non-administrative Unix user account. The combination of these two security techniques helps to mitigate any vulnerabilities that might be found: what would formerly have been a major security hole can sometimes be mitigated down to a minor inconvenience.
This combination of SELinux and “least privilege” protects the sensitive areas of the MX7000 iDRAC and Management Modules. Only the processes associated with establishing machine-to-machine trust have access to the private key information on the device, and access to these processes is limited. With defined separation of tasks and access between the different processes, an attacker of the MX7000 would find their ability to modify or control a system extremely limited, and accessing a remote system unlikely.
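The least-privilege rule described above (every process under its own unique, non-administrative account) is easy to express as a check. The following Python sketch is illustrative only; the process names and the `least_privilege_violations` helper are hypothetical and are not taken from the MX7000 firmware.

```python
def least_privilege_violations(process_uids):
    """Return a list of rule violations for a {process_name: uid} mapping.

    Two rules are checked: no process may run as root (uid 0), and no two
    processes may share the same uid.
    """
    violations = []
    first_owner = {}  # uid -> first process seen with that uid
    for process, uid in process_uids.items():
        if uid == 0:
            violations.append(f"{process} runs as root")
        if uid in first_owner:
            violations.append(
                f"{process} shares uid {uid} with {first_owner[uid]}"
            )
        else:
            first_owner[uid] = process
    return violations
```

An empty result means the mapping satisfies both rules. A compromised process in such a layout holds only the permissions of its own account.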
Machine to Machine Trust
So far, we’ve been building up a single system piece-by-piece in a secure manner, first by encrypting and verifying the boot process, next by ensuring that the running firmware image is protected. Next, we need to think about how we ensure that each component in the chassis can trust the other components and communicate with them in a secure manner.
As noted previously, each Management Module and iDRAC has a unique identity certificate, signed by a Dell Certificate Authority. These certificates are a key part of the trust establishment between the various iDRACs and Management Modules inside the MX7000. On startup, each system verifies its installed certificates against the Dell EMC Root CA to ensure they are valid. A device whose certificates are corrupted or invalid will be unable to establish machine-to-machine trust with other devices in the chassis.
Once assembled and powered on, the MX7000 Management Modules and iDRACs automatically start the discovery and machine-to-machine trust establishment process. All network communication inside the MX7000 is done over private IPv6 VLANs. The addresses in these VLANs are stateless and based on the router advertisement from the Enclosure Controller. To locate other devices in the 2^128 IPv6 address space, individual devices use mDNS announcements to broadcast their presence. As the iDRACs and Management Modules discover each other, they begin the process of establishing machine-to-machine trust.
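To make "stateless" addressing concrete, the sketch below derives an IPv6 address from an advertised /64 prefix and a MAC address using the modified EUI-64 method. This is an assumption for illustration: the text above does not say which interface-identifier scheme the MX7000 uses, and the prefix, MAC address, and `slaac_eui64` helper are hypothetical.

```python
import ipaddress


def slaac_eui64(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a modified EUI-64 interface identifier."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("stateless autoconfiguration expects a /64 prefix")
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("MAC address must have six octets")
    # Insert ff:fe in the middle and flip the universal/local bit.
    eui64 = bytearray(octets[:3] + b"\xff\xfe" + octets[3:])
    eui64[0] ^= 0x02
    interface_id = int.from_bytes(eui64, "big")
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)
```

No address server is involved: each device computes its own address from the router advertisement, which is why peers then need mDNS announcements to find one another.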
A Management Module or iDRAC wishing to access resources on a discovered iDRAC or Management Module must prove it is an authentic Dell device to gain access. An ECIES (Elliptic Curve Integrated Encryption Scheme) exchange using an ECDH (Elliptic Curve Diffie-Hellman) key exchange is used to pass the public portion of the “client’s” certificate to the discovered “server”. The server validates the certificate chain of the public certificate against the Dell root CA. If it is valid, the server uses the public key present in the certificate to form the shared symmetric key. The final piece of the puzzle is giving the server a way to validate that the client actually has access to the private key for the unique identity certificate. To do this, the server uses a technique called “proof-of-possession” that is specified in RFC 7800. The proof-of-possession verification assures the server that the client has both the public and private portions of a valid Dell identity certificate. Having fully vetted the client, the server provides the client with a temporary JWT (JSON Web Token) that the server has signed and that the client can use to access the resources of the server.
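The last step, issuing a signed temporary token, can be sketched with the Python standard library alone. This is a minimal illustration, assuming an HS256 (HMAC-SHA256) signature and a one-minute lifetime. The text above does not state which signing algorithm or lifetime the MX7000 uses, and the function names and claim values are hypothetical. The certificate validation and proof-of-possession steps are outside this sketch.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, subject: str, ttl_seconds: int, now: int) -> str:
    """Return a compact JWT signed by the server (HS256 is an assumption)."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "iat": now, "exp": now + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256)
    return signing_input + "." + _b64url(signature.digest())


def verify_token(secret: bytes, token: str, now: int):
    """Return the payload if the token is authentic and unexpired, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        signature = _b64url_decode(signature_b64)
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None  # malformed token
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # not signed by this server
    if now >= payload["exp"]:
        return None  # token has expired
    return payload
```

Because the expiry travels inside the signed token, the server can validate a request without keeping per-client session state. It only needs its own signing secret and the current time.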
Encrypting management network traffic
In previous versions of PowerEdge servers, communication between devices within a chassis was expected to be secure due to the ‘physical’ security of a private internal network. An iDRAC blade would automatically become part of the chassis group by being physically inserted into the chassis. Communications to the sled from the CMC were done over telnet, HTTP, and other unencrypted channels. The authentication between the processes on the devices often relied on common shared passwords or other such preprogrammed credentials. While this was an extremely fast and robust design, Dell EMC has evaluated these processes and identified security concerns, which have been addressed in the MX7000.
Communications are no longer made over “clear text” HTTP connections: the new Redfish interface used in the chassis runs completely over HTTPS. The encryption of the REST information prevents packet snooping by other devices on the network. Previous multitenant chassis could be compromised by a malicious user attempting to steal information from others on the same chassis. With the encrypted HTTPS communications this is no longer possible.
HTTPS is not the only communications path updated on the MX7000: all communication between iDRACs and Management Modules inside the chassis is encrypted. Linux sockets between the iDRAC and the Management Module are encrypted using ECC (Elliptic Curve Cryptography). Communication over network sockets is possible only after the iDRAC and Enclosure Controller have established bidirectional machine-to-machine trust. Only when both sides have vetted each other are the connections established, using the keys transferred during trust establishment. This protects data passed between the devices, preventing snooping, as well as blocking attackers from pretending to be a Dell EMC device and accessing data.
Another issue addressed in the MX7000 is the use of common default or fixed “hidden” user accounts with passwords programmed into the firmware. A fixed username/password known to the software allowed each device to quickly access and configure others without requiring user interaction. The pitfalls of common shared passwords are well documented, and to avoid these issues the new MX7000 chassis uses unique, short-duration, stateless token authentication. Unlike a normal username and password, tokens are not tied to an actual user account on a device. In the MX7000, the iDRAC can issue an admin token to the MSM for reading or changing configuration without affecting user-based authentication from its GUI. The MSM does not need a ‘user’ account, and the new automated machine-to-machine trust assures the iDRAC that it is talking to an authentic MSM. Since the MSM is now a trusted administrator, this presents all sorts of new possibilities.
Conclusions
In this blog, we demonstrated how the PowerEdge MX solution has a robust security protocol and architecture. By implementing a secure boot process within the MX7000, we ensure that the system starts running only if the code passes integrity checks. Subsequently, runtime security measures ensure that the system remains safe from malicious hacking attempts. Additionally, a validated network security measure ensures that everything in the chassis is a system running trusted code. These enhanced security measures use best-in-class tools to protect customer systems, and we believe that this new chassis represents the most secure chassis management system in the industry.
Related Blog Posts
Direct from Development – PowerEdge MX7000 At the Box Serial Access
Thu, 12 Nov 2020 19:26:21 -0000
Summary
PowerEdge MX7000 comes with a Management Module that provides chassis management. This technical white paper describes, step by step, the “at-the-box” serial access feature of the chassis management firmware. A typical use of the serial access feature is troubleshooting when remote access to the management firmware is not available.
Preparation
What you need?
To prepare for serial access, you need the correct cable for connection. You will need a “micro-USB to USB” cable (Figure-1) long enough to connect your client system to the micro-USB port in the Management Module.
Figure 1 USB to Micro USB Cable
Where to connect?
The micro-USB port (Figure-2) for serial access is in the Management Module located at the rear of the chassis. If you see two Management Modules, look for the module that has the LED under “i” lit.
Figure 2 - Micro USB port to connect to
What you need in the client?
You can use any serial terminal client application of your choice, such as Tera Term or PuTTY.
Windows Client Host
If your client host system is running Windows, the default serial device driver should work. Open the Device Manager (type “devmgmt.msc” from command line) to determine which COM port Windows has created for your serial connection.
If Windows is not able to see the serial COM port or it is present but you are not able to connect, you may have to manually install the device driver. You can get this driver from a 3rd party vendor. Search for “cypress semiconductor usb serial driver download”. Look for the driver download link. After the manual driver installation, you should see the COM port for your connection (example in Figure-3).
Figure 3 – 3rd party serial device driver in Windows
Linux Client Host
If your client host system is running Linux, the device driver to connect to the serial interface should already be installed. There is an extra step however that is required to correctly recognize the Management Module serial device.
The USB serial device is recognized by Linux as a “Thermometer” device and loads the cytherm kernel module. The following steps help to correctly recognize the Management Module serial device.
First, add this entry “blacklist cytherm” to the file “/etc/modprobe.d/blacklist.conf”. This will prevent loading the incorrect driver.
Next, connect the serial cable to the host system. If you have already connected the serial cable, you will need to unload the incorrect driver with the command “sudo rmmod cytherm”. Then re-connect the serial cable to the host system.
If you see “/dev/ttyACM0” then you are ready to connect. The “0” means it is the first serial device discovered.
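The Linux checks above can also be scripted. The following Python sketch is a convenience under stated assumptions: the helper names are hypothetical, it only reads the blacklist text you pass in and lists matching device nodes, and it does not unload or load any kernel module.

```python
import glob


def is_blacklisted(conf_text: str, module: str = "cytherm") -> bool:
    """Return True if conf_text contains an active 'blacklist <module>' line."""
    for line in conf_text.splitlines():
        words = line.split("#", 1)[0].split()  # ignore trailing comments
        if words == ["blacklist", module]:
            return True
    return False


def find_acm_ports(pattern: str = "/dev/ttyACM*") -> list:
    """Return the serial device nodes that match pattern, sorted by name."""
    return sorted(glob.glob(pattern))
```

A typical use is to read “/etc/modprobe.d/blacklist.conf”, pass its contents to `is_blacklisted`, and then call `find_acm_ports()` after re-connecting the cable to see whether “/dev/ttyACM0” has appeared.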
Serial Console
Serial Console Menu
When a serial connection is established to the Management Module, the serial client application will be presented with the serial console’s main menu (Figure-4). It is populated with the available components to which serial connection can be made. On the upper right corner of the menu, it shows which Management Module you are connected to (the Active or the Standby). When you are finished, you may simply disconnect the cable and exit the serial client application.
The following sections describe each selection in the Main menu.
Figure 4 - Main menu
Chassis manager firmware console
Choosing option (A) from the Main menu takes you to the Chassis Manager firmware console. A serial session will open and a login prompt is displayed.
On successful login, you will have access to the Chassis Manager’s firmware racadm interface. To end the session, the exit sequence is “Ctrl-A Ctrl-X”. If using minicom in Linux, the exit sequence is “Ctrl-A Ctrl-A Ctrl-X”. Upon exit, you will see the Main menu.
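If you drive the session from a script rather than a keyboard, the exit sequences above translate into control bytes. The Python sketch below shows the conversion. The `ctrl` helper and constant names are hypothetical, and how the bytes are written to the port depends on your serial library.

```python
def ctrl(char: str) -> bytes:
    """Return the single control byte produced by Ctrl-<char>."""
    code = ord(char.upper()) ^ 0x40
    if not 0 <= code < 0x20:
        raise ValueError(f"Ctrl-{char} is not an ASCII control character")
    return bytes([code])


# Exit sequences quoted in the text above.
CHASSIS_MANAGER_EXIT = ctrl("A") + ctrl("X")
MINICOM_EXIT = ctrl("A") + ctrl("A") + ctrl("X")
```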
I/O module firmware console
Choosing option (B) from the Main menu takes you to the I/O Module Console menu (Figure-5). The menu shows you the available I/O modules that support the serial interface.
Prior to selecting an I/O module, you will have the option to toggle the connection mode to either “binary” or “non-binary” using option (B) from the menu. In “binary” mode, the terminal control characters from the client application are passed through the serial session.
Upon selection of an I/O module, a serial session will open and a login prompt is displayed. On successful login, you will have access to the I/O module firmware command line.
Figure 5 - I/O module console menu
To end a non-binary session, the exit sequence is “Ctrl-\”.
Ending a binary session requires an extra step: log in to the Chassis Manager’s web interface and go to Home > Troubleshoot > Terminate Serial Connection.
Server serial console
Choosing option (C) from the Main menu takes you to the Sled Host Serial Console menu (Figure-6). The menu shows you the server hosts available in the sleds present in the chassis.
Figure 6 - Sled host serial menu
Prior to selecting a server sled, you will have the option to toggle the connection mode to either “binary” or “non-binary” using option (B) from the menu. In “binary” mode, the terminal control characters from the client application are passed through the serial session.
Upon selection of a server sled, you will get access to the serial command line interface of the operating system running on the sled.
To end a non-binary session, the exit sequence is “Ctrl-\”. This exit sequence can be configured from the sled’s iDRAC UI.
Ending a binary session requires an extra step: log in to the Chassis Manager’s web interface and go to Home > Troubleshoot > Terminate Serial Connection.
Server management firmware console
Choosing option (D) from the Main menu takes you to the iDRAC Serial Console menu (Figure-7). The menu shows you the iDRACs present in the chassis. iDRAC is the systems management firmware for a compute sled.
Figure 7- iDRAC console menu
Direct from Development – PowerEdge MX7000 LED Device Status
Thu, 12 Nov 2020 19:10:27 -0000
Summary
The MX7000 chassis and the modular devices in an MX7000 chassis are equipped with multi-purpose LEDs, which can indicate the current health state of the device, provide identification, or implement device-specific features.
This whitepaper intends to provide a single point of comprehensive status information for LED behaviors on PowerEdge MX7000.
Users want to be able to look at the chassis and deduce its current health state when physically in front of the chassis. Most of the components that are present in the MX7000 chassis are able to display their current health state via LEDs.
Users also want to be able to accurately identify components in a chassis. A useful feature for this is the Identify function, which can be activated from the front panel or remotely via the OpenManage Enterprise Modular GUI. This is especially useful when you are managing a multi-chassis setup and want to remotely identify a particular device in the pool.
Some devices also implement their own specific LED behavior. For example, the PowerEdge MX5016s implements an LED feature that indicates mapping state. This document covers these features.
Management Module LED Behavior
The Management Module (MM) is located at the rear of the chassis (Figure 1) and contains two LEDs: Power LED (Green only) and Status LED/Button (Blue or Amber).
Status LED/Button (Blue or Amber) is on the left and the Power LED (Green only) is on the right as shown by red highlights.
Figure 1: Management Module
The Power and Status LED (color is dependent on status) states are as follows:
Healthy Chassis
MM State | Power LED State | Status LED State |
Active | LED ON (Green) | LED ON (Blue-solid) |
Standby | LED ON (Green) | LED OFF |
Identify (Active) | LED ON (Green) | LED ON (Blue-blinking) |
Faulted Chassis
MM State | Power LED State | Status LED State |
Active | LED ON (Green) | LED ON (Amber-blinking) |
Identify (Active) | LED ON (Green) | LED ON (Blue-blinking) |
(Note: Only active MM will reflect faulted chassis state and provide identification functionality.)
Management Module Hardware Failure
Issue | Power LED State | Status LED State |
MM unable to power on | LED OFF | LED OFF |
MM unable to boot up | LED OFF | LED ON (Amber-solid) |
The Status LED/Button on the rear of the chassis changes to AMBER when any of the Front Panel iconic indicators shows AMBER. When the chassis/MM is in Identify State, the combo Status LED/Button shall always blink BLUE and override any other Status LED state.
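The Management Module tables above can be encoded as a small lookup, which may be handy in runbooks or support tooling. The Python sketch below copies the states directly from those tables. The dictionary layout, the lowercase state labels, and the `describe_mm` helper are hypothetical.

```python
# (power LED, status LED) -> meaning, taken from the tables above.
MM_LED_STATES = {
    ("green", "blue-solid"): "Active MM, healthy chassis",
    ("green", "off"): "Standby MM",
    ("green", "blue-blinking"): "Active MM in Identify state",
    ("green", "amber-blinking"): "Active MM, faulted chassis",
    ("off", "off"): "MM unable to power on",
    ("off", "amber-solid"): "MM unable to boot up",
}


def describe_mm(power_led: str, status_led: str) -> str:
    """Translate an observed LED pair into the state it indicates."""
    key = (power_led.lower(), status_led.lower())
    return MM_LED_STATES.get(key, "Unknown LED combination")
```

Blue-blinking appears only once because Identify overrides any other Status LED state, so a faulted chassis in Identify state looks the same as a healthy one.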
IO Module LED Behavior
I/O Modules (IOMs) are inserted in the rear of the chassis and support a two-stacked arrangement of LEDs: Top = AMBER/GREEN, Bottom = BLUE.
Figure 2a – Typical Fab A/B IO Module: Power/Status LED on the top and Identification LED on bottom as shown by red highlights.
Figure 2b – Typical Fab C IO Module: Power/Status LED on the top and Identification LED on bottom as shown by red highlights.
The LEDs support the following functions:
IOM Health | Power/Status LED State | Identification LED State |
Healthy | LED ON (Green) | - |
Faulted | LED ON (Amber) | - |
Identify | - | LED ON (Blue-blinking) |
The green LED behavior can be overridden to indicate a fabric mismatch: in that case, the green LED blinks for 2.5 seconds and then stays lit.
Sled LED Behavior
The Sleds are inserted in the front of the chassis and contain an LED for Power/Status/Identification via Blue or Amber colors.
Figure 3: Current PowerEdge MX Sled Options
The Power/Status/Identification LED is on the top left highlighted in red.
The Power/Status/Identification (color is dependent on status) LED states for a sled device will be as follows:
Sled Health | Power/Status/Identification LED State |
Off | LED OFF |
Healthy | LED ON (Blue) |
Errors exist (System on/off) | LED ON (Amber-blinking) |
Identify | LED ON (Blue-blinking) |
Failsafe | LED ON (Amber-solid) |
For PowerEdge MX5016s (Figure 3), a cylindrical LED is also available marked with green highlight in the figure. Its behavior is as follows:
Mapping state | Cylinder LED on PowerEdge MX5016s |
Mapped to Compute that is powered ON | LED ON (Blinking) |
Unmapped | LED OFF |
All mapped compute sleds are off | LED OFF |
NOTE: It is unsafe to remove the PowerEdge MX5016s any time the LED is blinking, as it has active mappings to compute sleds that are powered on. To remove the PowerEdge MX5016s, either unmap storage from all compute sleds, or power down all compute sleds that are using this storage. See the User Guide for more information.
PSU LED Behavior
The Power Supply Units (PSUs) are inserted in the front of the chassis and utilize four LEDs: 3 on the front (figure below, left) and 1 in the back (figure below, right).
Figure 4 - Front and Rear PSU LEDs
The PSU LED states are as follows:
PSU State | Health LED (Front) | AC Present (Front) | DC Present (Front) | AC Present (Rear) |
Healthy | LED ON (Green) | LED ON | LED ON | LED ON |
Faulted | LED ON (Amber) | - | - | - |
On the front of the PSU, if the AC Present LED is illuminated, then AC is detected and within tolerance. If the DC Present LED is illuminated, then the PSU is supplying DC to the chassis. The AC Present LED on the rear of the chassis, when illuminated, indicates that AC is detected.
FAN LED Behavior
The Fans are inserted in the front and the back of the chassis (Figures 6 and 7) and contain one LED: Power/Status LED (Green or Amber).
Figure 6 – Front Fans Power/Status LED
Figure 7 – Rear Fans Power/Status LED
The Power/Status (color is dependent on status) LED states will be as follows:
Fan Health | Power/Status LED State |
Off | LED OFF |
Healthy | LED ON (Green) |
Fault | LED ON (Amber-blinking) |
Firmware Update in Progress | LED ON (Green-blinking) |
Conclusion: A thorough understanding of the physical LED states makes it possible to assess health at a glance and provides feedback for timely troubleshooting. The PowerEdge MX management module, compute sleds, storage sleds, IO Modules, power supplies, and fans each have LED state indicators that convey health, identification, and device-specific features.