Reprotect after a temporary failure | Implementing Dell PowerFlex SRA with VMware Live Site Recovery

The previous section describes the best possible scenario for a smooth reprotection because it follows a planned migration where no errors are encountered. For recovery plans failed over in disaster recovery mode, this may not be the case.

Disaster recovery mode allows for failures ranging from a minor fault to a full site failure of the protection data center. If these failures are temporary and recoverable, a fully successful reprotection may be possible once they have been rectified. In this case, a reprotection behaves similarly to the scenario described in the previous section. If a reprotection is run before the failures are corrected, or if certain failures cannot be fully recovered, the reprotection operation is incomplete. This section describes this scenario.

For reprotect to be available, the following steps must first occur:

A recovery must be run with all the steps finishing successfully. If there are any errors during the recovery, the user needs to resolve the issues that caused the errors and then re-run the recovery.
The original site should be available and the SRM servers at both sites should be in a connected state. If the sites are disconnected, reprotect fails immediately as shown in the figure below. If the original site cannot be restored (for example, if a physical catastrophe destroys the original site), automated reprotection cannot be run, and manual recreation is required if and when the original protected site is rebuilt.

Figure 104. Reprotect fails with sites disconnected

If the protected site SRM server was disconnected during failover and is reconnected later, SRM retries certain recovery operations before allowing reprotect. This typically occurs if the recovery plan was unable to connect to the protected site vCenter server and power off the virtual machines because of network connectivity issues. If network connectivity is restored after the recovery plan was failed over, SRM detects this situation and requires the recovery plan to be re-run in order to power those virtual machines down.
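The conditions above can be summarized as a simple pre-check. The following Python sketch is illustrative only: the `ReprotectState` class, its fields, and the `reprotect_blockers` function are hypothetical names introduced here, not part of SRM or the PowerFlex SRA, and the values would have to be filled in by an operator from what the SRM interface reports.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReprotectState:
    """Hypothetical snapshot of the prerequisites described above."""
    recovery_succeeded: bool         # every recovery step finished without error
    sites_connected: bool            # SRM servers at both sites are connected
    protected_vms_powered_off: bool  # original-site VMs were powered down


def reprotect_blockers(state: ReprotectState) -> List[str]:
    """Return the actions still required before reprotect can be attempted."""
    blockers = []
    if not state.recovery_succeeded:
        blockers.append("resolve the recovery errors and re-run the recovery")
    if not state.sites_connected:
        blockers.append("restore the connection between the SRM servers")
    if not state.protected_vms_powered_off:
        blockers.append("re-run the recovery plan to power off the original VMs")
    return blockers


if __name__ == "__main__":
    # Example: sites were reconnected, but the original VMs are still running.
    state = ReprotectState(True, True, False)
    for action in reprotect_blockers(state):
        print("Before reprotect:", action)
```

An empty list means none of the listed prerequisites is outstanding; it does not guarantee that the reprotect itself completes without errors.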

A reprotection operation fails if it encounters any errors the first time it runs. If so, the reprotect must be run a second time, with the Force cleanup option selected, as shown in the figure below.

Figure 105. Forcing a reprotect operation
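The two-pass behavior can be sketched as follows. This is a minimal illustration under stated assumptions: `run_reprotect` is a hypothetical callable standing in for whatever starts the reprotect (in practice, an operator using the SRM interface), and it is assumed to return `True` only when the run finishes without errors.

```python
from typing import Callable


def reprotect_with_force_retry(run_reprotect: Callable[[bool], bool]) -> str:
    """Run reprotect once normally, then once more with force cleanup if needed.

    run_reprotect(force_cleanup) is a hypothetical stand-in for starting the
    operation; it returns True when the run finished without errors.
    """
    if run_reprotect(False):
        return "completed"
    # The first attempt hit errors, so repeat it with the force option.
    # Errors in this second pass are reported but do not stop the operation.
    run_reprotect(True)
    return "completed-with-force"


if __name__ == "__main__":
    # Simulated operation that only gets through when forced.
    print(reprotect_with_force_retry(lambda force_cleanup: force_cleanup))
```

A `completed-with-force` result is a signal to check the storage side by hand, because a forced run can finish even when replication was not reversed.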

Once the force option is selected, any errors are acknowledged and reported but ignored. This allows the reprotect operation to continue even if it has experienced errors. All the possible steps are attempted and completed. Therefore, in certain situations, the PowerFlex replication may not be properly reversed even though the recovery plan and protection groups were. If the Configure Storage to Reverse Direction step fails, manual user intervention with the PowerFlex GUI or CLI may be required to complete the process. The user should ensure that:

A source/target swap has occurred, that is, the replicated source and target devices have changed personalities.
Asynchronous replication has been reestablished.
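The two checks above can be recorded per replicated pair and evaluated together. The sketch below is hypothetical: `PairStatus`, its fields, and the pair names are invented for illustration and are assumed to be filled in by hand from what the PowerFlex GUI or CLI shows; it does not call any PowerFlex interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PairStatus:
    """Hypothetical record of one replicated pair, as observed by an operator."""
    name: str
    role_before: str          # "source" or "target" before the reprotect
    role_after: str           # role observed after the reprotect
    replication_active: bool  # asynchronous replication is running again


def pairs_needing_attention(pairs: List[PairStatus]) -> List[str]:
    """Return pairs whose roles did not swap or whose replication is not re-established."""
    return [
        p.name
        for p in pairs
        if p.role_before == p.role_after or not p.replication_active
    ]


if __name__ == "__main__":
    observed = [
        PairStatus("pair-a", "source", "target", True),   # reversed and replicating
        PairStatus("pair-b", "source", "source", True),   # roles did not swap
    ]
    print(pairs_needing_attention(observed))
```

Any pair returned by the check is a candidate for the manual intervention described above.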

If a temporary storage failure or replication partition occurs, manual intervention is likely to be required before performing a reprotect operation. In this situation, the source devices may not have been unmounted.
