The iDRAC driver and the Redfish driver now support firmware management. This allows the user to update firmware for the various hardware components of a server to selected versions.
Firmware management has been tested against the following models of Dell EMC servers: PowerEdge R640, PowerEdge XE2420, PowerEdge R6515, and PowerEdge R630xd. Firmware updates of the following devices have been tested: iDRAC, BIOS, NIC, PERC H740P RAID controllers, and power supplies. Firmware management should work with any firmware update available on https://support.dell.com in the “Dell Update Packages in native Microsoft Windows 64-bit format”, also known as DUP format.
Detailed instructions on using firmware management are available here:
- https://docs.openstack.org/ironic/victoria/admin/drivers/idrac.html#management-interface
When downloading firmware from https://support.dell.com, select “Windows Server 2016” as the Operating system and “Update Package for MS Windows 64-Bit” as the Format.
As with all Ironic cleaning steps, the update_firmware cleaning step may only be executed against nodes that are in the manageable state in Ironic.
When creating a cleaning step to update the iDRAC firmware, always specify a wait time of 300 seconds for that firmware image. This allows the iDRAC time to restart before Ironic begins using it again. Not specifying this wait time may cause the update_firmware cleaning step to fail.
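As a minimal sketch of the wait-time advice above, the following Python builds the JSON for one update_firmware cleaning step and adds a 300-second wait to the iDRAC image only. It assumes the step format described in the Ironic iDRAC driver documentation linked earlier (a `firmware_images` list whose entries carry a `url` and an optional `wait`). The helper name, the `is_idrac` flag, and the URLs are hypothetical and exist only for illustration.

```python
import json

IDRAC_WAIT_SECONDS = 300  # wait time recommended above for iDRAC firmware


def build_update_firmware_step(images):
    """Build one update_firmware clean step.

    images: list of (url, is_idrac) tuples; is_idrac marks the iDRAC image.
    """
    firmware_images = []
    for url, is_idrac in images:
        image = {"url": url}
        if is_idrac:
            # Give the iDRAC time to restart before Ironic uses it again.
            image["wait"] = IDRAC_WAIT_SECONDS
        firmware_images.append(image)
    return {
        "interface": "management",
        "step": "update_firmware",
        "args": {"firmware_images": firmware_images},
    }


if __name__ == "__main__":
    # Hypothetical web server address and file names.
    step = build_update_firmware_step([
        ("http://192.0.2.10/firmware/idrac.EXE", True),
        ("http://192.0.2.10/firmware/bios.EXE", False),
    ])
    print(json.dumps([step], indent=2))
```

The printed JSON can be saved to a file and, assuming the standard baremetal CLI, passed to `openstack baremetal node clean <uuid> --clean-steps <file>`.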
Rolling back a firmware update consists of updating the firmware to the prior version. As such, the prior version of the firmware must be staged on the web server. Updating to the same version of firmware that a node is already running will cause that firmware to be reinstalled.
Warning:
If a server loses power while it is in the process of updating firmware, devices within the server or the server itself may be rendered inoperable. Be sure to take whatever precautions are necessary to ensure power loss does not occur.
To monitor the progress of an update_firmware cleaning step, you may:
- Execute:
- Log in to the iDRAC GUI and view the job queue.
- Log in to the iDRAC GUI and bring up the virtual console of the node to watch it as it progresses through the firmware updates.
- For maximum detail, examine the Ironic conductor log. This log contains the Redfish Task GET responses from the iDRAC. Examine the TaskState and TaskStatus properties for detailed task information.
When an update_firmware cleaning step finishes, the state of the node in Ironic changes from clean wait to manageable on success, or to clean failed on failure.
In the event of a failure, the reason for the failure can be found by running the following command to examine the last_error property on the node in Ironic:
openstack baremetal node show <uuid> | grep last_error
Further detail on an error can generally be found in the Ironic conductor log. Additional information on an error may be found in the job queue or event log of the iDRAC. Examining the virtual console of the node may also provide further detail.
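The state transitions and the last_error check described above could be automated roughly as follows. This is a sketch, not a supported tool: it assumes the `openstack` CLI is installed and authenticated, that cleaning has already been started on the node, and that the `-f value -c <field>` output options are available. The function names, polling interval, and overall timeout are hypothetical choices.

```python
import subprocess
import time


def get_node_field(node, field):
    """Return one field of a node by calling the openstack CLI."""
    result = subprocess.run(
        ["openstack", "baremetal", "node", "show", node,
         "-f", "value", "-c", field],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def wait_for_cleaning(node, get_field=get_node_field, interval=30, timeout=3600):
    """Poll until cleaning ends; return (provision_state, last_error)."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_field(node, "provision_state")
        if state == "manageable":
            return state, None
        if state == "clean failed":
            # Fetch the failure reason recorded on the node.
            return state, get_field(node, "last_error")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"node {node} still in state {state!r}")
        time.sleep(interval)
```

Calling `wait_for_cleaning("<uuid>")` returns once the node reaches manageable or clean failed. Any other state, such as clean wait, is treated as still in progress.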
Note: Loss of network connectivity to an iDRAC is normal and expected when updating the iDRAC/Lifecycle Controller firmware. Connectivity loss is reflected in the Ironic conductor log.
A common error is specifying an incorrect URL in the update_firmware cleaning step. To avoid this error, use a browser, curl, wget, or a similar tool to validate the URLs before executing the update_firmware cleaning step.
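For a batch of images, the URL check suggested above can be scripted. The sketch below sends an HTTP HEAD request to each URL using only the Python standard library and reports the ones that fail. It assumes the web server answers HEAD requests; some servers do not, in which case a GET-based check would be needed. The function name and the `opener` parameter are hypothetical.

```python
import urllib.error
import urllib.request


def check_firmware_urls(urls, opener=urllib.request.urlopen, timeout=10):
    """Return a dict mapping each problematic URL to a short reason."""
    problems = {}
    for url in urls:
        try:
            # Building the request inside the try block also catches
            # malformed URLs, which raise ValueError.
            request = urllib.request.Request(url, method="HEAD")
            with opener(request, timeout=timeout) as response:
                if response.status != 200:
                    problems[url] = f"HTTP {response.status}"
        except urllib.error.HTTPError as exc:
            problems[url] = f"HTTP {exc.code}"
        except (OSError, ValueError) as exc:
            # OSError covers URLError, refused connections, and timeouts.
            problems[url] = str(exc)
    return problems
```

An empty result means every URL responded with HTTP 200. Any entry in the result should be corrected before the cleaning step is run.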
Ironic allows a fixed period of time for a single cleaning step to complete. This time is configurable in ironic.conf by changing the value of the clean_callback_timeout setting in the [conductor] group. This setting defaults to 1800 seconds (30 minutes). This means that if the firmware updates in a single cleaning step take longer than 30 minutes, then Ironic will fail the cleaning step. If this happens, any firmware update in progress on the server will continue to completion, and may be successful or may fail. To avoid timeout errors, when doing a number of firmware updates that may take longer than 30 minutes in total to apply, split the updates into separate cleaning steps. Each update will then have 30 minutes to complete instead of a 30-minute timeout for all of the updates together.
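The splitting approach described above can be sketched in Python as follows. Given a list of firmware image entries, it emits one update_firmware cleaning step per image, so each update gets its own clean_callback_timeout window. The step layout is assumed from the Ironic iDRAC driver documentation; the helper name and URLs are hypothetical.

```python
import json


def one_step_per_image(firmware_images):
    """Return a list of update_firmware clean steps, one per image."""
    return [
        {
            "interface": "management",
            "step": "update_firmware",
            "args": {"firmware_images": [image]},
        }
        for image in firmware_images
    ]


if __name__ == "__main__":
    # Hypothetical web server address and file names.
    images = [
        {"url": "http://192.0.2.10/firmware/bios.EXE"},
        {"url": "http://192.0.2.10/firmware/nic.EXE"},
        {"url": "http://192.0.2.10/firmware/idrac.EXE", "wait": 300},
    ]
    print(json.dumps(one_step_per_image(images), indent=2))
```

The trade-off is that a single combined step is shorter to write, while separate steps keep one slow update from consuming the time budget of the others.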
| Group | Setting | Description |
|---|---|---|
| redfish | firmware_update_status_interval | The interval, in seconds, at which the BMC is polled for firmware update status. Defaults to 60 seconds. 0 disables polling. |
| redfish | firmware_update_fail_interval | The interval, in seconds, between checks for failed firmware updates. Defaults to 60 seconds. 0 disables polling. The associated task cleans up temporary state on the node in Ironic, preparing it for further use. |
| conductor | clean_callback_timeout | The amount of time, in seconds, to allow a cleaning step to run before timing out. Defaults to 1800 seconds. 0 equates to no timeout. |
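To see which values of these settings are in effect, the configuration text can be inspected with a short script. The sketch below parses ironic.conf content with Python's `configparser` and falls back to the defaults listed in the table when a setting is absent. It assumes the file is plain INI-style text that `configparser` can read, which may not hold for every oslo.config feature. The function name is hypothetical.

```python
import configparser

# Defaults as documented in the table above.
DEFAULTS = {
    ("redfish", "firmware_update_status_interval"): 60,
    ("redfish", "firmware_update_fail_interval"): 60,
    ("conductor", "clean_callback_timeout"): 1800,
}


def effective_settings(conf_text):
    """Return the effective value of each setting, in seconds."""
    # Interpolation is disabled so '%' characters elsewhere in the file
    # are harmless; strict=False tolerates repeated options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return {
        f"[{group}] {name}": parser.getint(group, name, fallback=default)
        for (group, name), default in DEFAULTS.items()
    }


if __name__ == "__main__":
    sample = "[conductor]\nclean_callback_timeout = 3600\n"
    for setting, value in effective_settings(sample).items():
        print(f"{setting} = {value}")
```

In practice the text would come from the conductor's ironic.conf, for example by reading the file and passing its contents to `effective_settings`.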