These are the prerequisites for a successful SRDF/Metro HA setup:
- Windows Server failover clusters have specific requirements for disk resources, network configuration, and hardware compatibility, among other things. It is important to identify cluster nodes that meet those requirements so that they can pass the cluster validation steps highlighted later in this section.
- A Witness component is highly recommended in an SRDF/Metro setup. The Witness acts as an arbitrator when the SRDF/Metro ‘local’ and ‘remote’ storage systems cannot communicate with each other. It helps SRDF/Metro determine the “winning” side that should continue to service I/O. The recommendation is to plan for one or more Witnesses.
- SRDF/Metro has an extensive support matrix. It is highly recommended to ensure the environment is supported before starting a deployment.
- We will set up a two-node Windows Server failover cluster and deploy a SQL Server failover cluster. Additional nodes can be added following a similar procedure.
- It is expected that each Windows Server failover cluster node is zoned and masked to either the local or the remote storage system, but not both. This keeps the overall solution simpler. If using cross-links, refer to the section Using cross-links.
- It is important to ensure that both local and remote storage systems have SRDF ports available and zoned to each other over FC or GigE.
- Devices for the Windows failover cluster quorum and for the SQL Server data and log volumes are created and formatted on the source/R1 side cluster node. Starting with Windows Server 2019, format (allocation unit) sizes of 128KB and higher are also supported. As a best practice with PowerMax, we recommend a format size of at least 128KB. A 128KB allocation unit size aligns with the PowerMax track size of 128KB and provides optimal performance and synergy with other PowerMax data services such as Windows Offloaded Data Transfer (ODX), and PowerMax compression and deduplication.
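As a rough illustration of the 128KB recommendation in the last prerequisite, the following Python sketch works through the alignment arithmetic only. The function names are hypothetical, the sketch assumes the volume starts on a track boundary, and it does not measure or predict performance.

```python
KIB = 1024
TRACK_SIZE_BYTES = 128 * KIB  # PowerMax track size as stated in the text


def tracks_touched(offset_bytes: int, length_bytes: int,
                   track_size_bytes: int = TRACK_SIZE_BYTES) -> int:
    """Count the tracks overlapped by the byte range [offset, offset + length)."""
    if offset_bytes < 0 or length_bytes <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first_track = offset_bytes // track_size_bytes
    last_track = (offset_bytes + length_bytes - 1) // track_size_bytes
    return last_track - first_track + 1


def allocation_units_per_track(allocation_unit_bytes: int,
                               track_size_bytes: int = TRACK_SIZE_BYTES) -> float:
    """Return how many allocation units share one track (1.0 is one-to-one)."""
    if allocation_unit_bytes <= 0:
        raise ValueError("allocation unit size must be > 0")
    return track_size_bytes / allocation_unit_bytes


if __name__ == "__main__":
    # Compare a few allocation unit sizes against the 128KB track.
    for au_kib in (4, 64, 128):
        units = allocation_units_per_track(au_kib * KIB)
        print(f"{au_kib}KB allocation unit: {units:g} unit(s) per track")
```

With a 128KB allocation unit, each unit maps to exactly one track. With smaller units, several units share a track, and a 128KB range that does not start on a track boundary spans two tracks.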
When is the ‘remote’ environment ready for use?
To answer this question, we need to examine the storage devices in a PowerMax system. Each device has two SCSI identities: an internal identity, which is the original World Wide Name (WWN) and geometry, and an external identity, which can be changed to look like another device. SRDF/Metro changes the external SCSI identity of the remote devices so that their WWNs are identical to those of the local devices. This can only happen once SRDF/Metro is fully in sync and the paired devices’ data is identical, with a replication state of active/active. Only then can the Windows Server failover cluster use these devices on the remote storage system, because they are now in a RW (Read Write) state.
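The identity behavior described above can be pictured with a small conceptual model. This is only a sketch for illustration, not PowerMax or SRDF code: the `Device` class, the `try_present_remote` function, the state strings, and the WWN values are all hypothetical, and treating the remote device as not host-writable before the pair is active/active is an assumption drawn from the paragraph above.

```python
from dataclasses import dataclass

ACTIVE_ACTIVE = "active/active"  # replication state named in the text


@dataclass
class Device:
    internal_wwn: str    # original identity, never changes
    external_wwn: str    # identity shown to hosts, can be changed
    host_writable: bool  # True models the RW (Read Write) state


def try_present_remote(local: Device, remote: Device, pair_state: str) -> bool:
    """Give the remote device the local external identity once in sync.

    Returns False and leaves the remote device untouched unless the pair
    has reached the active/active state.
    """
    if pair_state != ACTIVE_ACTIVE:
        return False
    remote.external_wwn = local.external_wwn  # hosts now see one device
    remote.host_writable = True               # remote side becomes RW
    return True


if __name__ == "__main__":
    # Placeholder WWNs, for illustration only.
    local = Device("wwn-local-0001", "wwn-local-0001", host_writable=True)
    remote = Device("wwn-remote-0001", "wwn-remote-0001", host_writable=False)

    print(try_present_remote(local, remote, "synchronizing"))  # not ready yet
    print(try_present_remote(local, remote, ACTIVE_ACTIVE))
    print(remote)
```

The internal WWN of the remote device is never modified in this model. Only the external identity changes, which is why the host sees the local and remote devices as paths to the same device.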
SRDF naming conventions
In an SRDF environment, a minimum of two storage systems is required. SRDF designates the replication source devices as R1 and the replication target devices as R2. Once SRDF/Metro is in an active/active state, both sides replicate writes to each other equally, so there is no difference between the source and target systems, or between R1 and R2 devices. The host sees them as the same device.
When replication stops
If SRDF/Metro replication stops for any reason (planned or unplanned), SRDF chooses a surviving (winning) side, either by the user’s choice (planned), or by using the Witness rules or Bias rules (unplanned). The devices on the winning side automatically become or remain R1 devices, are set to the RW state, and maintain the external SCSI identity used during the active/active replication period. The devices on the side that did not win become or remain R2 devices, are set to the WD (Write Disabled) state, and do not keep the external SCSI identity they had during the active/active replication period. This is the expected behavior when a cluster is partitioned, and it avoids any chance of a split-brain situation.
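The outcome described above can be summarized as a small state transition. The following Python sketch is a conceptual model only, not SRDF code: the `Side` class and both function names are hypothetical, and the order in which the planned choice, the Witness decision, and the Bias setting are consulted is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Side:
    name: str
    role: str    # "R1" or "R2"
    access: str  # "RW" (Read Write) or "WD" (Write Disabled)
    keeps_external_identity: bool = True


def choose_winner(planned_choice: Optional[str],
                  witness_choice: Optional[str],
                  bias_side: str) -> str:
    """Pick the surviving side.

    Assumed order: user's choice (planned stop), then the Witness
    decision, then the Bias side.
    """
    if planned_choice is not None:
        return planned_choice
    if witness_choice is not None:
        return witness_choice
    return bias_side


def stop_replication(sides: List[Side], winner: str) -> None:
    """Apply the post-stop state to both sides, in place."""
    if winner not in {side.name for side in sides}:
        raise ValueError(f"unknown side: {winner}")
    for side in sides:
        if side.name == winner:
            # Winner: R1, RW, external identity retained.
            side.role, side.access = "R1", "RW"
            side.keeps_external_identity = True
        else:
            # Losing side: R2, WD, external identity not retained.
            side.role, side.access = "R2", "WD"
            side.keeps_external_identity = False


if __name__ == "__main__":
    sides = [Side("A", "R1", "RW"), Side("B", "R2", "RW")]
    winner = choose_winner(None, witness_choice="B", bias_side="A")
    stop_replication(sides, winner)
    for side in sides:
        print(side)
```

Because exactly one side ends up RW with the shared external identity, hosts can keep writing to only one copy of the data, which is how the split-brain situation is avoided.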