
Full Redundancy vs. Fault Tolerant Redundancy for PowerEdge Server PSUs


Mon, 16 Jan 2023 13:44:19 -0000


Matt Ogle

John Jenne

David Hardy

Summary

Understanding the power supply redundancy options available for your server is important when you need to prioritize one use case over another, such as full, consistent performance during fault conditions versus higher performance and capabilities during normal operating conditions. This Direct from Development (DfD) tech note discusses two PSU redundancy options, Full Redundancy (FR) and Fault Tolerant Redundancy (FTR), and explains when it may be advantageous to adopt one over the other.

Introduction

Customers need power redundancy to maintain application uptime. However, few know that there is more than one type of redundancy to consider, and that the best option depends on several factors. This DfD explains two power supply unit (PSU) redundancy options: Full Redundancy (FR) and Fault Tolerant Redundancy (FTR). Dell Technologies now enables customers to select between these at point of sale for select platforms. Understanding these options is critical because the selection determines the minimum PSU capacity required to support the targeted PowerEdge server configuration.

FR configurations run at full performance during normal operating conditions and after PSU redundancy loss (when a PSU goes down due to input loss or a fault). FR is optimized for consistent performance, so the minimum PSU capacity allowed is the one that can support the platform configuration's full-performance power requirements. In summary, PowerEdge users who adopt FR gain consistent performance during normal and fault operating conditions, but require a PSU capacity capable of supporting full-performance power requirements.

FTR configurations run at full performance during normal operating conditions, but after PSU redundancy loss, intelligent platform power control loops may dynamically reduce system performance to limit the platform’s power consumption within the capacity of the healthy PSU. FTR is optimized to enable support for richer platform configurations within a target PSU capacity that provides additional performance and capabilities during normal operations. The target PSU capacity is driven by multiple potential factors, such as:

A larger PSU capacity is not available
PSU capacity is right-sized for a typical workload for CapEx and/or OpEx savings
The configuration must be supported within the capacity of PSUs with a C14 inlet connector
The configuration must be supported within the low-line AC (110V) power limits of C14 and C20 inlet connectors
The required PSU efficiency level and/or input type is only available in limited PSU capacities

To support richer configurations with more performance and capability, FTR takes advantage of the additional capacity of the redundant PSU during normal operation. However, when PSU redundancy is lost, FTR must take away performance to compensate for the loss of the power capacity that enabled it. In summary, PowerEdge users who adopt FTR have richer platform configuration options within a PSU capacity limit, but must assess the potential impact of performance degradation on their workload.
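The difference between the two modes can be sketched as a simple power-budget model. The snippet below is a minimal illustration, not Dell's control-loop implementation: the function names, the 1+1 PSU arrangement, the assumption that FTR can budget up to the combined capacity of the healthy PSUs, and the wattages in the usage lines are all hypothetical.

```python
def sustained_power_limit_w(psu_capacity_w: float, healthy_psus: int, mode: str) -> float:
    """Upper bound on sustained platform power in a simplified 1+1 PSU model.

    Assumption: FTR may budget up to the combined capacity of the healthy PSUs,
    while FR never budgets more than a single PSU can deliver.
    """
    if healthy_psus not in (1, 2):
        raise ValueError("this sketch models a 1+1 configuration with 1 or 2 healthy PSUs")
    if mode == "FR":
        return psu_capacity_w
    if mode == "FTR":
        return psu_capacity_w * healthy_psus
    raise ValueError(f"unknown redundancy mode: {mode!r}")


def delivered_power_w(demand_w: float, limit_w: float) -> float:
    """Power the platform is allowed to draw; demand above the limit is throttled."""
    return min(demand_w, limit_w)


# Hypothetical numbers: 1400 W PSUs and a configuration that can demand 1800 W.
normal = delivered_power_w(1800, sustained_power_limit_w(1400, 2, "FTR"))
after_loss = delivered_power_w(1800, sustained_power_limit_w(1400, 1, "FTR"))
print(normal, after_loss)  # 1800 1400
```

Under FR with the same 1400 W PSUs, the limit would be 1400 W in both states, so the 1800 W configuration would need a larger PSU rather than throttling after a fault.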

Addressing the Negative Stereotype

Historically, FR has been deemed the superior PSU redundancy option. Customers viewed FTR concepts as a “trick” to compensate for a design limitation. Dell Technologies was originally opposed to supporting FTR due to the negative stigma associated with it.

Eventually, Dell Technologies added support for FTR to PowerEdge platforms because platform power requirements were increasing faster than PSU technology advancements. FTR was not advertised or marketed despite being an essential technology to support platform configurations that customers wanted. Only limited references were made in technical white papers.

As FTR concepts have become standard within the industry, FTR is now seen as a minor trade-off for a greater upside: a solution to various modern datacenter power challenges that does not require additional PSUs, greater PSU capacity, or a loss in redundancy. As component density and quantity continue to increase with each generation, customers require more and more power yet still face the same mechanical (limited space) or electrical (power budget) constraints. FTR addresses these challenges by allowing the total load to exceed the capacity of a single PSU during normal operation, using the additional capacity of the redundant PSU. The result is a considerable increase in the sustained and peak power available during normal operating conditions.

That is the irony of FTR: its “fatal flaw” of throttling has also become its “saving grace”. FR does not allow for performance variations while FTR does, and this creates use cases where users can leverage FTR to support richer configurations without upgrading their PSU infrastructure. Figure 1 illustrates power, performance, and capability during normal operating conditions, while Figure 2 illustrates them after a PSU redundancy loss event:

Figure 1 – Example of FR/FTR performance during normal operating conditions

Figure 2 – Example of FR/FTR performance after PSU redundancy loss occurs

User Navigation Example

The latest generation of PowerEdge servers (15G) supports the option to choose Full Redundancy or Fault Tolerant Redundancy via PSU options at point of sale. Users can configure their servers via the sales portal on www.dell.com and can go a step deeper with the Dell Enterprise Infrastructure Planning Tool (EIPT) for more granular guidance, as shown in Figure 3. Reviewing the PSU options in the PSU Guide and the workload power details in EIPT will help PowerEdge users fine-tune their PSU configuration. The PSU capacity options are color-coded as follows:

Gray – PSU capacity options cannot support the platform configuration
White – FTR. PSU capacity options can support the platform configuration, but performance may be degraded after PSU redundancy loss
Green – FR. Minimum PSU capacity that can support the configuration with full performance during normal and fault operating conditions. Capacities greater than this capacity are also FR

Figure 3 – Dell EIPT tool displaying various power and cost metrics based on configured PowerEdge server

For example, as seen in Figure 3, 2400W is required for FR. FTR enables the configuration to be supported with 1400W, 1100W, or 800W PSUs. If the platform were the R650 instead of the R750, the 2400W PSU would not be an available option because it uses the larger 86mm form factor, which is not supported in the 1U R650. In that case, FTR enables a configuration to be supported that otherwise could not be.
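The gray/white/green labeling can be expressed as a small classification, shown here as a sketch rather than EIPT's actual logic. The function name and the two thresholds are assumptions: `fr_min_w` is the 2400 W FR requirement from the example, `ftr_min_w` is set to 800 W only because that is the smallest capacity the example lists as FTR-capable, and the 600 W entry is a hypothetical value included to show the gray case.

```python
def classify_psu_options(capacities_w, fr_min_w, ftr_min_w):
    """Map each PSU capacity to 'FR', 'FTR', or 'unsupported'."""
    result = {}
    for capacity in capacities_w:
        if capacity >= fr_min_w:
            result[capacity] = "FR"           # green: full performance even after a PSU loss
        elif capacity >= ftr_min_w:
            result[capacity] = "FTR"          # white: may throttle after a PSU loss
        else:
            result[capacity] = "unsupported"  # gray: cannot support the configuration
    return result


# Thresholds are assumptions for illustration (see the paragraph above).
options = classify_psu_options([600, 800, 1100, 1400, 2400], fr_min_w=2400, ftr_min_w=800)
for capacity, label in options.items():
    print(f"{capacity} W: {label}")
```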

If the customer required the PSU input voltage to be low-line AC (110V), the 1400W and 1100W PSUs would be limited to a 1050W output, and the 2400W PSU would be limited to 1400W. Since 2400W is required for FR, this configuration could not be supported with FR. FTR enables this configuration to be supported with low-line AC input.
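The low-line reasoning can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the derating table contains only the three limits given in the text, the function names are hypothetical, and ratings without a stated low-line limit (such as 800W) are deliberately left out rather than guessed.

```python
# Low-line (110V) output limits stated in the text: rated W -> usable W.
LOW_LINE_OUTPUT_W = {2400: 1400, 1400: 1050, 1100: 1050}


def usable_output_w(rated_w: int, low_line: bool) -> int:
    """Return the usable output capacity for a PSU rating.

    Raises KeyError for ratings whose low-line limit the text does not give.
    """
    if not low_line:
        return rated_w
    return LOW_LINE_OUTPUT_W[rated_w]


def fr_possible(rated_options_w, fr_required_w: int, low_line: bool) -> bool:
    """FR is possible only if some PSU can still deliver the FR requirement."""
    return any(usable_output_w(r, low_line) >= fr_required_w for r in rated_options_w)


options = [1100, 1400, 2400]
print(fr_possible(options, 2400, low_line=False))  # True
print(fr_possible(options, 2400, low_line=True))   # False
```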

EIPT estimates the typical power consumption with the 2400W PSU for the target workload to be 751W. The Maximum Potential Power (power virus) is estimated to be 1307W. Note that these are input power estimates, so they are a little higher than the output power estimates and vary with PSU capacity due to the PSU efficiency curves. The 2400W PSU is the FR recommendation over the 1400W PSU, despite the worst-case sustained power estimate of 1307W, because there are short-duration power transients that exceed the 1400W power delivery capability.

FTR enables the customer to optimize CapEx and OpEx by right-sizing their PSU capacity. The 1400W PSU could be a right-sized option that still provides significant capacity to eliminate or minimize any potential performance degradation. With an estimated 751W typical power, the 1100W and 800W PSUs would be more aggressive right-sizing options that provide the needed power for the user’s workload, assuming the workload does not change. If the workload or environment changes and PSU redundancy is lost, FTR will manage the load increase to avoid unexpected shutdown and potential data loss.
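A rough way to compare the right-sizing options is to compute the headroom each capacity leaves over the EIPT estimates. The sketch below is only an approximation: it compares input-power estimates (taken with the 2400W PSU) directly against rated output capacities, it covers sustained power only and ignores the short-duration transients mentioned above, and the function and field names are hypothetical.

```python
TYPICAL_W = 751         # EIPT typical power estimate from the example
MAX_POTENTIAL_W = 1307  # EIPT Maximum Potential Power estimate from the example


def headroom_report(capacities_w, typical_w=TYPICAL_W, max_w=MAX_POTENTIAL_W):
    """Summarize how each single-PSU capacity compares with the estimates."""
    rows = []
    for capacity in capacities_w:
        rows.append({
            "capacity_w": capacity,
            "typical_headroom_w": capacity - typical_w,
            # Sustained power only; transients can exceed this estimate.
            "covers_max_potential": capacity >= max_w,
        })
    return rows


report = headroom_report([1400, 1100, 800])
for row in report:
    print(row)
```

With these figures, the 1400W PSU leaves 649W over typical power and covers the sustained worst case, while the 1100W and 800W PSUs leave 349W and 49W and would rely on FTR throttling if the load rose toward the maximum after a PSU loss.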

Pros, Cons and Use Cases

Full Redundancy

Pros
- Consistent performance during normal operating conditions and PSU redundancy loss
- No PSU throttling
Cons
- Maximum sustained power is constrained to the specifications of one PSU
- Does not utilize the additional capacity of the redundant PSU during normal operation
Use Cases
- Configurations that meet power requirements with only one PSU
- Workloads that are sensitive to performance variations, such as HPC
- Platforms that do not have mechanical constraints, such as limited space for more PSUs
- Data centers that do not have electrical constraints, such as low-line AC

Fault Tolerant Redundancy

Pros
- Allows for increased sustained performance during normal operating conditions
- Utilizes the additional capacity of the redundant PSU during normal operation
- Eliminates cost of purchasing additional or higher capacity PSU
- Does not require giving up PSU redundancy
- Does not require down-grading platform configuration to fit within target PSU capacity
Cons
- Performance may be reduced during PSU redundancy loss
Use Cases
- Configurations that would meet power requirements with the performance increase coming from the redundant PSU
- Platforms that have mechanical constraints, such as limited space for more PSUs
- Data centers that have electrical constraints, such as low-line AC

Conclusion

Dell Technologies supports both Full Redundancy (FR) and Fault Tolerant Redundancy (FTR) options for the latest generation (15G) of PowerEdge servers. By understanding the pros and cons of each redundancy type, users can optimize their server by upgrading or downgrading their platform configuration or PSU capacity to match the type of power redundancy they want.
