Disaster recovery testing
Organizations should run periodic Disaster Recovery (DR) tests to confirm that their procedures work as expected. Regular testing helps minimize surprises and unexpected issues in an actual disaster scenario. A comprehensive DR test verifies that the dataset on the replication destination system can be read from and written to. It also brings applications online using the data from the destination system to confirm that there are no errors.
Starting with PowerStoreOS 3.0, a NAS server thin clone can be used for DR testing purposes. The NAS server can be cloned along with one, some, or all of its file systems. This is the recommended way to perform a DR test because it has no impact on production and can be easily configured using PowerStore Manager.
The NAS server can be cloned on either the source or destination system. If the NAS server is cloned on the source system, the administrator can replicate the cloned NAS server and perform a planned failover operation to bring the resources online on the destination system for the DR test. If the NAS server is cloned on the destination system, then no failover is necessary because the cloned resources are already accessible on the destination system.
The NAS server clone creates a NAS server with the same configuration as the destination NAS server. The only settings that are not copied are ones that would cause conflicts, such as the network interfaces and joining the SMB server to the domain. To enable access to the newly cloned NAS server, add a new and unique interface to the clone. Adding an IP address that is in use on either the source or destination is not allowed. If a domain-joined SMB server is needed, enter a new and unique SMB Computer Name, Domain Username, and Password to join it to the domain.
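The uniqueness rule for the clone's interface can be checked before the address is assigned. The following is a minimal sketch, not a PowerStore API: the function name and the sample addresses are hypothetical, and it assumes that you have already collected the interface addresses in use on the source and destination systems.

```python
import ipaddress


def is_unique_interface_ip(candidate, source_ips, destination_ips):
    """Return True if candidate is not already in use on either system.

    Hypothetical helper: the address lists are assumed to be gathered by
    the administrator; nothing here queries PowerStore.
    """
    wanted = ipaddress.ip_address(candidate)
    # Parsing normalizes notation, so equivalent IPv6 spellings compare equal.
    in_use = {ipaddress.ip_address(ip) for ip in (*source_ips, *destination_ips)}
    return wanted not in in_use


# Illustrative addresses only (documentation ranges).
source_ips = ["192.0.2.10", "192.0.2.11", "2001:db8::1"]
destination_ips = ["192.0.2.20"]

for ip in ("192.0.2.30", "192.0.2.20"):
    verdict = "unique" if is_unique_interface_ip(ip, source_ips, destination_ips) else "already in use"
    print(ip, verdict)
```

Comparing parsed address objects instead of raw strings avoids missing a conflict that is only written differently, such as an expanded IPv6 address.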
The cloned NAS server and its file systems can be mounted to perform the DR test without any impact on production or replication. Changes made to the cloned resources do not affect the production resources, and the reverse is also true. After the DR test completes, you can delete the cloned resources.
Some organizations are required to run their DR test using exactly the same configuration as production. This makes it possible to run through the DR process and procedures without requiring any changes, which can reduce risk and increase reproducibility in an actual failure scenario. However, reusing IP addresses on the production network creates IP address conflicts. To avoid conflicts, this type of DR test must be run in a network bubble that is completely isolated from the production environment.
Starting with PowerStoreOS 3.6, you can create a DR Test (DRT) NAS server using the CLI or REST API. DRT NAS servers allow creating a NAS server with the same configuration as production, including the ability to use duplicate IP addresses. If a duplicate IP address is not required, the NAS server clone functionality can be leveraged instead.
Before configuring this feature, it is critical to ensure that dedicated network segments for the network bubble are configured at the destination site. Dedicated segments ensure that there is no overlap or interference with production or replication. The network configuration is done externally to PowerStore and can be implemented in many ways, depending on your environment and requirements. The contents of the network bubble can be an exact clone of the production environment, with its own set of clients and services.
Creating a DRT NAS server creates a NAS server with the same configuration as the destination NAS server. The only settings that are not copied are the network interfaces and the SMB server domain join. The NAS server interfaces must be configured afterwards because they need to be placed on a different link aggregation (LA) or fail-safe network (FSN) bond than the one the destination NAS server uses. This separation ensures that there is no interference between the destination and DRT NAS servers if a failover is initiated.
A DRT NAS server can be created using one of the following commands:
When the DRT NAS server is created, add a file interface using one of the following commands:
For more information about PSTCLI and REST API commands, see the Dell PowerStore CLI Reference Guide and the Dell PowerStore REST API Reference Guide on Dell.com/powerstoredocs.
Although the initial configuration of DRT NAS servers can only be done using the PSTCLI or REST API, you can manage these resources using any of the standard tools after creation. DRT NAS servers, file interfaces, file systems, SMB shares, and NFS exports can be modified or deleted using PowerStore Manager, PSTCLI, or REST API. Figure 43 shows a destination NAS server along with a DRT NAS server in PowerStore Manager.
If the production NAS server uses a domain-joined SMB server, the DRT NAS server must be joined separately to the AD domain in the bubble. If the production AD environment is cloned, the AD computer object may already exist, and the domain join operation will fail. In this situation, the computer object must be deleted from the domain in the network bubble or the join operation must be performed using the CLI with the -reuse_computer_account option.
When a DRT NAS server is created, all its file systems, SMB shares, and NFS exports are automatically copied. If there are any unwanted resources that were created as part of the DRT NAS server operation, they can be deleted. Note that snapshots from the source file systems are not copied.
When the DRT NAS server is configured, the DR test can be performed without any impact on production or replication. Changes made to the DRT resources do not affect the production resources, and the reverse is also true. After the DR test completes, you can delete the DRT NAS server and its resources.
Starting with PowerStoreOS 3.0, a planned failover operation can be leveraged to perform a DR test. Because the production NAS server and file systems are being failed over, this operation can impact the production workloads. This option should only be used during a maintenance window.
When the planned failover operation completes, the NAS server and resources are accessible on the destination system and the DR test can be performed. The source system can also be shut down at this time, if desired.
Note that an unplanned failover should never be used for DR testing purposes. An unplanned failover is intended only for situations in which the source system is inaccessible, and it causes any changes made since the last synchronization to be lost. An unplanned failover that is initiated from the destination system is not allowed while the source system and production resources are online.
It is also critical to avoid bringing down the replication connection and initiating an unplanned failover operation to perform a DR test. Because the communication path between the systems is unavailable, PowerStore cannot ensure that both NAS servers are in a compatible state. When the connection is restored, PowerStore recognizes that both NAS servers are in production mode, which could result in a split-brain scenario. To prevent split-brain, PowerStore places both NAS servers into maintenance mode so that data cannot be written to both locations. In this situation, engage Dell Technical Support for assistance.
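The DR test options described so far (NAS server clone, DRT NAS server, and planned failover) can be summarized as a small decision helper. This is an illustrative sketch only: the function names, parameter names, and method labels are hypothetical, and the rules encode nothing beyond the version requirements and guidance stated in this section. An unplanned failover is deliberately never returned.

```python
def parse_version(text):
    """Turn a version string such as '3.6' or '4.0.1' into (major, minor)."""
    parts = text.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def eligible_dr_test_methods(os_version, needs_production_ips, in_maintenance_window):
    """List DR test methods in order of preference (hypothetical helper)."""
    version = parse_version(os_version)
    methods = []
    # The clone is the recommended option, but it requires new, unique IPs.
    if version >= (3, 0) and not needs_production_ips:
        methods.append("NAS server clone")
    # DRT NAS servers allow duplicate IPs inside an isolated network bubble.
    if version >= (3, 6):
        methods.append("DRT NAS server")
    # A planned failover moves production itself, so it is limited to
    # maintenance windows.
    if version >= (3, 0) and in_maintenance_window:
        methods.append("planned failover")
    return methods


print(eligible_dr_test_methods("3.6", needs_production_ips=True, in_maintenance_window=False))
```

An empty list means that none of the methods described here applies to the given release and constraints.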
Any changes made to the destination system after the failover are normally preserved. They are replicated back to the original source when a normal reprotect operation is initiated on the NAS server. After the NAS server is reprotected, another planned failover operation can be initiated to bring the resources online on the original source system.
If the administrator does not want to preserve the changes made to the destination system after the failover, they can use the discard_changes_after_failover option when running the reprotect command. Starting with PowerStoreOS 4.0, this option is available when issuing a reprotect using PSTCLI or REST API on asynchronous replication sessions.
This feature makes it possible to make changes during the DR test without making them persistent. It removes the need to revert changes manually by changing the configuration and restoring from a snapshot. The benefits include a lower chance of errors and less time and effort to bring production back online after the DR test is complete.
Reprotect with discard changes is a single command that reverses the changes that were made for the DR test. It discards changes made after the failover, fails back to the original source system, and reprotects the replication session in the original direction. Unlike a standard reprotect, replication never reverses direction. The result is the original production configuration that was in place before the failover. Run this command only after the DR test is complete.
The reprotect with discard changes operation cannot be run if the replication session is already reprotected. If the administrator plans to use the reprotect with discard changes option, they must issue a normal “failover to destination” and not use “failover to destination and reprotect”.
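The conditions for using reprotect with discard changes can be collected into a simple pre-flight check. The sketch below is an assumption-laden illustration: the SessionSnapshot class and its field names are hypothetical and do not correspond to PowerStore object attributes; only the rules themselves (PowerStoreOS 4.0 or later, asynchronous replication, failed over but not yet reprotected, DR test complete) come from the text above.

```python
from dataclasses import dataclass


@dataclass
class SessionSnapshot:
    """Hypothetical summary of a replication session, filled in by hand."""
    os_version: tuple        # (major, minor), for example (4, 0)
    replication_type: str    # "asynchronous" or "synchronous"
    failed_over: bool        # a planned failover to destination has completed
    reprotected: bool        # the session has already been reprotected
    dr_test_complete: bool   # the DR test has finished


def discard_reprotect_blockers(session):
    """Return the reasons why reprotect with discard changes should not run."""
    blockers = []
    if session.os_version < (4, 0):
        blockers.append("requires PowerStoreOS 4.0 or later")
    if session.replication_type != "asynchronous":
        blockers.append("only available on asynchronous replication sessions")
    if not session.failed_over:
        blockers.append("session has not been failed over to the destination")
    if session.reprotected:
        blockers.append("session is already reprotected")
    if not session.dr_test_complete:
        blockers.append("DR test is still in progress")
    return blockers


session = SessionSnapshot((4, 0), "asynchronous", True, False, True)
print(discard_reprotect_blockers(session) or "no blockers found")
```

Returning every blocker instead of stopping at the first one gives the administrator a complete list to resolve before the command is attempted.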
When using the reprotect with discard changes command, all of the following changes are discarded:
Note that some changes are not discarded when using this feature. This includes:
To run the reprotect with discard changes command:
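As a hedged illustration of the REST API variant, the sketch below only builds a request object and does not send it. Only the option name discard_changes_after_failover comes from this section. The endpoint path, the host name, and the session ID are assumptions made for illustration, and authentication is omitted entirely; verify the exact syntax against the Dell PowerStore REST API Reference Guide before use.

```python
import json
import urllib.request


def build_discard_reprotect_request(host, session_id):
    """Build (but do not send) a reprotect request that discards changes.

    ASSUMPTION: the URL layout below is illustrative and must be checked
    against the REST API reference. Authentication is not handled here.
    """
    url = f"https://{host}/api/rest/replication_session/{session_id}/reprotect"
    # Option name taken from the text; a boolean JSON value is assumed.
    body = json.dumps({"discard_changes_after_failover": True}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and session ID, for illustration only.
request = build_discard_reprotect_request("powerstore.example.com", "example-session-id")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Building the request separately from sending it makes it easy to review the URL and payload before anything reaches the system.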
For more information about PSTCLI and REST API commands, see the Dell PowerStore CLI Reference Guide and the Dell PowerStore REST API Reference Guide on Dell.com/powerstoredocs.