Before starting the firmware and driver updates, contact either Dell Technologies Support or Microsoft Support to unlock, or “break the glass” on, the privileged endpoint (PEP) session. Unlocking the session allows the update process to be monitored in real time while each SU node is placed into Maintenance mode. The basic process for unlocking the PEP session is shown below:
invoke-command -session $session -scriptblock { get-supportsessiontoken }
invoke-command -session $session -scriptblock { unlock-supportsession }
Figure 3. Unlocking the PEP session
Note: We recommend that the Azure Stack Hub Operator continue to have Dell Technologies Support or Microsoft Support monitor the progress after “breaking the glass.”
Note: You must take a valid inventory of the environment to record the “before” and “after” results of the SU node’s firmware update process.
Invoke-CheckFirmwareBaseline -IPAddresses <single IP address or comma-separated list of scale unit node iDRAC IP addresses> -iDRACCredential (Get-Credential) -oemExtensionPath "DUPS" -ManagementServerAddress <OME VM OS IP> -ManagementServerCredential (Get-Credential)
Note: All text enclosed in “< >” is a placeholder for illustration purposes only. Replace each <xxxx> with the value for your environment.
The following figure is a sample output. Items in red are the Current Version and the Available Version. If the Available Version is greater than the Current Version, that component needs to be remediated. Pay close attention to the -IPAddresses parameter, because its value changes to match each scale unit node as it is updated.
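The comparison rule above can be sketched in PowerShell. This is only an illustration and is not part of the Dell tooling: the function name Test-FirmwareNeedsUpdate is hypothetical, and the sketch assumes both versions are dotted numeric strings with two to four parts that the .NET [version] type can parse. Version strings in other formats would need different handling.

```powershell
# Hypothetical helper (not part of the Dell tooling): returns $true when the
# available version is newer than the current version.
# Assumes dotted numeric versions with 2 to 4 parts, which [version] can parse.
function Test-FirmwareNeedsUpdate {
    param(
        [Parameter(Mandatory)] [string] $CurrentVersion,
        [Parameter(Mandatory)] [string] $AvailableVersion
    )
    # Casting to [version] compares each part numerically, so 2.10 is treated
    # as newer than 2.9, which a plain string comparison would get wrong.
    return ([version]$AvailableVersion) -gt ([version]$CurrentVersion)
}

# Example: 2.10.0 is newer than 2.9.4, so this call returns True.
Test-FirmwareNeedsUpdate -CurrentVersion '2.9.4' -AvailableVersion '2.10.0'
```

Note that [version] treats strings with a different number of parts as different versions (for example, 2.9 sorts before 2.9.0), so compare values that use the same number of parts.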
The firmware updates should be applied in a round-robin or N+1 fashion, starting with SU node #1.
Notes
invoke-command -session $session -scriptblock { (get-clusternode -cluster s-cluster -name sac02-s1-n01).State }
invoke-command -session $session -scriptblock { (get-clusternode -cluster s-cluster -name sac02-s1-n01).StatusInformation }
invoke-command -session $session -scriptblock { (get-clusternode -cluster s-cluster -name sac02-s1-n01).DrainStatus }
Note: These instructions assume that the updates are being applied in the production environment. Some update cycles require multiple reboots. After the SU node is in Maintenance mode, load the required files to each iDRAC individually so that the updates proceed one node at a time. When one node completes, the Azure Stack Hub Operator resumes that node, waits for the cluster to rebalance, and then drains the next node for maintenance, until all nodes are complete. The firmware updates take about 45 minutes to run on each node. On average, the rebalance process takes 60 to 120 minutes per node, depending on the activity and load of the cluster.
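Because draining and rebalancing can take a long time, the three status checks above can be wrapped in a polling loop instead of being rerun by hand. The following is a minimal sketch, not part of the documented procedure. Wait-ForCondition and Wait-NodeDrained are hypothetical names, and the sketch assumes that $Session is an unlocked PEP session and that a drained node reports a State of Paused and a DrainStatus of Completed.

```powershell
# Hypothetical helper: polls a condition until it returns $true or the
# timeout expires. Returns $true on success and $false on timeout.
function Wait-ForCondition {
    param(
        [Parameter(Mandatory)] [scriptblock] $Condition,
        [int] $TimeoutSeconds = 7200,
        [int] $PollSeconds = 60
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ($true) {
        if (& $Condition) { return $true }
        if ((Get-Date) -ge $deadline) { return $false }
        Start-Sleep -Seconds $PollSeconds
    }
}

# Hypothetical wrapper: waits until the named node has finished draining.
# Assumes an unlocked PEP session; the expected values 'Paused' and
# 'Completed' are assumptions, so confirm them in your environment.
function Wait-NodeDrained {
    param(
        [Parameter(Mandatory)] $Session,
        [Parameter(Mandatory)] [string] $NodeName,
        [string] $ClusterName = 's-cluster'
    )
    Wait-ForCondition -Condition {
        $status = Invoke-Command -Session $Session -ScriptBlock {
            $node = Get-ClusterNode -Cluster $using:ClusterName -Name $using:NodeName
            # Return plain strings so the values compare reliably after remoting.
            [pscustomobject]@{
                State       = $node.State.ToString()
                DrainStatus = $node.DrainStatus.ToString()
            }
        }
        ($status.State -eq 'Paused') -and ($status.DrainStatus -eq 'Completed')
    }
}
```

A call such as Wait-NodeDrained -Session $session -NodeName 'sac02-s1-n01' would then block until the node reports as drained or the two-hour default timeout expires.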
The following figure is a sample of the Job Queue on a 13G server.
The following figure is a sample of the Job Queue on a 14G server.
Note: Upload the updates in the order they are listed in the Catalog.xml file.
The following list of firmware and the figure show updates to be applied to a 13G SU node:
Once loaded, the updates are automatically prioritized by criticality.
Note: When all the jobs have been reported as completed, you can select and delete the jobs from the queue.
Invoke-CheckFirmwareBaseline -IPAddresses <single IP address of current scale unit node's iDRAC> -iDRACCredential (Get-Credential) -oemExtensionPath "DUPS" -ManagementServerAddress <OME VM OS IP> -ManagementServerCredential (Get-Credential)
Note: When running the Invoke-CheckFirmwareBaseline command at this step in the update process, do NOT use the “-remediate” parameter.
Invoke-CheckBIOSSettings -IPAddresses <single IP address of current scale unit node's iDRAC> -iDRACCredential (Get-Credential)
Note: If the command reports BIOS settings that require changes, run the same command with the “-remediate” parameter. If required, the command automatically reboots the node.
Note: The scripts return an error message if the “-remediate” parameter is used while the node being worked on is not in a powered-down state.
Invoke-CheckIDRACSettings -IPAddresses <single IP address of current scale unit node's iDRAC> -iDRACCredential (Get-Credential)
Note: The iDRAC settings function automatically validates and remediates, because a change to the iDRAC settings does not reboot the server.
To resume after the updates:
invoke-command -session $session -scriptblock { (get-clusternode -cluster s-cluster -name sac02-s1-n01).State }
invoke-command -session $session -scriptblock { (get-clusternode -cluster s-cluster -name sac02-s1-n01).StatusInformation }
invoke-command -session $session -scriptblock { (get-clusternode -cluster s-cluster -name sac02-s1-n01).DrainStatus }
invoke-command -session $session -scriptblock { get-storagejob }
invoke-command -session $session -scriptblock { get-virtualdisk -cimsession s-cluster | Get-StorageJob }
Note: When all the Storage Jobs are complete and the cluster has been rebalanced, wait an additional 10-15 minutes before initiating a Drain on the next node.
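The wait described in this note can be sketched as a small PowerShell function. This is an illustration only: Wait-StorageJobsComplete and its parameters are hypothetical names, the caller supplies the script block that counts unfinished jobs, and the default settle time of 600 seconds is an assumption taken from the low end of the 10-15 minute range.

```powershell
# Hypothetical helper: waits until no storage jobs remain unfinished, then
# waits an extra settle period before the next node is drained.
function Wait-StorageJobsComplete {
    param(
        # Script block that returns the number of unfinished storage jobs.
        [Parameter(Mandatory)] [scriptblock] $GetPendingJobCount,
        [int] $PollSeconds = 60,
        # Extra wait after the last job completes (assumed 10 minutes).
        [int] $SettleSeconds = 600
    )
    while ((& $GetPendingJobCount) -gt 0) {
        Start-Sleep -Seconds $PollSeconds
    }
    Start-Sleep -Seconds $SettleSeconds
}

# Example call, assuming $session is the unlocked PEP session used above and
# that finished jobs report a JobState of 'Completed':
# Wait-StorageJobsComplete -GetPendingJobCount {
#     Invoke-Command -Session $session -ScriptBlock {
#         @(Get-StorageJob | Where-Object { $_.JobState -ne 'Completed' }).Count
#     }
# }
```

The sketch has no timeout, so a stuck storage job would make it wait indefinitely. Continue to check the job list manually if the wait runs well past the expected rebalance time.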
Note: Ensure that each node’s core count and memory appear correctly in the Administration Portal before proceeding. As shown in the following figure, those columns sometimes show a zero ‘0’ for the core count or a dash ‘-’ for the memory on the node that has been resumed. Wait until they display correctly before proceeding.
Note: When all the SU nodes are updated, exit the unlocked or “Broken Glass” PEP session.