This section shows the steps involved in creating SRDF/Metro protection for an Oracle database without the Smart DR component. The next section, Adding SRDF/Metro Smart DR protection to an existing SRDF/Metro, shows the steps of adding Smart DR to an already configured SRDF/Metro environment. The section after that, Creating SRDF/Metro and Smart DR protection at the same time, shows the steps of creating new SRDF/Metro protection together with Smart DR in a single operation.
To understand how the remote devices become usable by the hosts and ASM, we need to take a closer look at the storage devices in a PowerMax system. Each device has two SCSI identities: an internal identity, which is the original WWN and geometry, and an external identity, which can be changed to make the device look like another device. SRDF/Metro changes the external SCSI identity of the remote devices to match the WWNs of the local devices. This can only happen once SRDF/Metro is fully in sync and the paired devices' data is identical, with a replication state of active/active. Only then can ASM use the devices on the remote storage system, since they are now in a RW (Read Write) state.
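For reference, the identity a device presents to hosts can be inspected with Solutions Enabler (SYMCLI). The following is a hedged sketch; the storage system ID and device number are hypothetical, and the exact fields shown depend on the Solutions Enabler release:

# Show device details, including the WWN the device presents to hosts
# (hypothetical storage system ID and device number)
symdev -sid 001 show 00123

Once the pair is active/active, the remote device reports the same external WWN as its local partner.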
In an SRDF environment, a minimum of two storage systems is required. SRDF designates the replication source devices as R1 and the replication target devices as R2. Once SRDF/Metro is in an active/active state, both sides replicate writes to each other, so there is no practical difference between source and target systems, or between R1 and R2 devices. They are seen as the same device by the host.
If for any reason SRDF/Metro replication stops (planned or unplanned), SRDF chooses a surviving (winning) side, either by user choice (planned) or by using the Witness or Bias rules (unplanned). The devices on the winning side automatically become or remain R1 devices that are set to a RW state and keep the external SCSI identity used during the active/active replication period. The devices on the losing side become or remain R2, are set to a WD (Write Disabled) state, and no longer keep the external SCSI identity they presented during the active/active replication period. This is the expected behavior when a cluster is partitioned, and it avoids any chance of a split-brain situation.
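For reference, the SRDF/Metro pair state and the R1/R2 role of each side can be checked with SYMCLI. This is a hedged sketch; the storage system ID, storage group name, and SRDF group number are hypothetical:

# Query the SRDF/Metro pairs of a protected storage group
# (hypothetical storage system ID, SG name, and SRDF group number)
symrdf -sid 001 -sg grid_sg -rdfg 10 query

A pair state of ActiveActive (Witness) or ActiveBias indicates that both sides are in sync and RW; any other state means only the R1 side is servicing I/Os.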
An initial SRDF group is needed to provide the first path for syscalls[1] to the remote system, so that all the other higher-level functions of our management tools can work in the SRDF environment. The steps to set up the SRDF connection are as follows.
A storage group (SG) is a way to group devices so they can be managed as a single entity. An SG can be stand-alone or part of a parent-child relationship. A child SG is simply another stand-alone SG that has been added to a parent SG; a parent SG contains one or more child SGs. In this way, storage management operations on a stand-alone (or child) SG apply only to that SG, and operations on the parent apply to all its child SGs as a unit. The first step is to determine the SGs to be protected with SRDF/Metro. In the following example, grid_sg contains the Oracle +GRID ASM disk group devices (Oracle GI), fra_sg contains the Oracle +FRA ASM disk group devices (archive logs), and db_sg is a parent SG containing data_sg, the Oracle +DATA ASM disk group devices (data files), and redo_sg, the Oracle +REDO ASM disk group devices (redo logs).
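For reference, equivalent storage groups could be created with SYMCLI roughly as follows. This is a hedged sketch; the storage system ID and device number are hypothetical, and in practice the SGs may already exist as part of the hosts' masking views:

# Create a stand-alone SG and add a device to it
# (hypothetical storage system ID and device number)
symsg -sid 001 create grid_sg
symsg -sid 001 -sg grid_sg add dev 00120
# Create data_sg and redo_sg the same way, then group them under a parent SG
symsg -sid 001 create db_sg
symsg -sid 001 -sg db_sg add sg data_sg,redo_sg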
Create an empty SRDF group with a group label that indicates the SG to be protected. For example, to protect grid_sg, create an empty SRDF group with a Group Label such as “ORACLE_GI”.
Note: We pre-create the SRDF groups and labels, and only select them later during the SRDF/Metro protection step, because preparing them in advance lets us choose the exact ports we want to use and set labels that match their intended usage. Alternatively, this step can be skipped in favor of ad-hoc SRDF group creation during the SRDF/Metro protection step; however, that makes the protection step more complex.
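For reference, an empty, labeled SRDF group can also be created with SYMCLI. This is a hedged sketch; the storage system IDs, SRDF group numbers, and director:port values are hypothetical and depend on the RDF ports configured between the systems:

# Create an empty SRDF group labeled for the SG it will protect
# (hypothetical SIDs, SRDF group numbers, and director:port values)
symrdf addgrp -label ORACLE_GI -sid 001 -rdfg 10 -dir 1E:24 -remote_sid 002 -remote_rdfg 10 -remote_dir 1E:24

Later, during the SRDF/Metro protection step, the storage group can be paired and established over this group, for example with symrdf -sid 001 -sg grid_sg -rdfg 10 createpair -type rdf1 -metro -establish (again, treat the exact options as illustrative).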
Although SRDF/Metro can operate without a Witness (using Bias rules), it is highly recommended to configure one or more Witnesses to act as arbitrators in the case of unplanned cluster partitioning. If the two sides cannot communicate with each other, the Witness helps SRDF/Metro determine which side to keep alive, avoiding a split-brain situation.
There are two choices for Witness configuration: physical or virtual. More than one Witness can be configured; however, only one is active at a time. In addition, if both physical and virtual Witnesses are configured, the physical Witness takes precedence.
Note: The steps are not shown here, but they are exactly the same as in step 3, where we created an empty SRDF group, except this time make sure to check the “SRDF/Metro Witness Group” box. Repeat the operation from both the local and remote storage systems.
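For reference, the SYMCLI equivalent adds the -witness option to the same addgrp command used earlier. This is a hedged sketch; all identifiers are hypothetical:

# Create an empty Witness SRDF group from a Metro storage system to the Witness system
# (hypothetical SIDs, SRDF group numbers, and director:port values); repeat from the other Metro system
symrdf addgrp -witness -label METRO_WTN -sid 001 -rdfg 20 -dir 1E:25 -remote_sid 003 -remote_rdfg 20 -remote_dir 1E:25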
PowerMax devices are made visible to hosts through masking views. To create a masking view, you need a storage group (SG), a port group (PG), and an initiator group (IG). The SG contains the devices you want to make visible. The PG contains the storage front-end ports across which you want the devices to be visible. The IG contains the host bus adapter (HBA) ports’ WWNs, also known as initiators. An IG can be “cascaded”, meaning that it can have a parent-child relationship. We recommend that in Oracle clustered environments each database server (cluster node) has its own IG. These can then all be aggregated into a parent IG representing the cluster. The parent IG is the one used to create the masking view. This allows a node to be easily added to or removed from the cluster by simply adding or removing its IG from the parent IG.
Masking views are stored in the storage system containing the devices. As a result, each server that requires access to devices in a storage system needs to be part of a masking view in that system. Once the masking view is created, the devices in the SG are visible across the storage ports in the PG to the hosts’ initiators in the IG. Any change to a component of the masking view automatically propagates throughout the masking view. For example, if we add devices to the SG, they are automatically made visible to the servers. If we add or remove a child IG from the parent IG, the corresponding server either starts or stops seeing the devices in the storage group of the masking view.
Follow these steps to create masking views:
Note: On Linux servers it is easy to identify the initiators’ WWNs by running the following command: cat /sys/class/fc_host/host*/port_name. Enter the WWNs of the initiators in Unisphere for each database server.
Note: If cross-links are used, ALL cluster nodes need visibility to the ‘remote’ devices. If cross-links are not used, each database server has visibility to only one of the storage systems, either the ‘local’ or the ‘remote’ one. In that case, only the servers connected to the ‘remote’ storage system are included in that system’s masking views.
Note: Since masking views are stored in the storage system, if cross-links are used, to create the masking views of the remote storage system, open the remote system’s Unisphere and perform the operation from there.
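For reference, the masking-view building blocks described above map to SYMCLI roughly as follows. This is a hedged sketch; the group names, WWN, and director:port values are hypothetical:

# Child IG per cluster node, aggregated into a parent (cascaded) IG
symaccess -sid 001 -type initiator -name node1_ig create
symaccess -sid 001 -type initiator -name node1_ig add -wwn 10000090fa123456
symaccess -sid 001 -type initiator -name rac_ig create
symaccess -sid 001 -type initiator -name rac_ig add -ig node1_ig
# Port group with the front-end ports the devices should be visible on
symaccess -sid 001 -type port -name db_pg create -dirport 1D:4
symaccess -sid 001 -type port -name db_pg add -dirport 2D:4
# Masking view tying together the SG, PG, and parent IG
symaccess -sid 001 create view -name db_mv -sg db_sg -pg db_pg -ig rac_ig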
Using cross-links increases overall cluster availability at the cost of added complexity. With cross-links, each cluster node has visibility to the paired devices in both storage systems. If SRDF/Metro replication stops (planned or unplanned), only one system will “win” and its R1 devices will continue to serve I/Os; the other storage system will turn its devices into R2s and stop servicing I/Os.
Without cross-links, the Oracle instances connected to the storage system with the R2 devices will have to shut down (planned downtime) or crash (unplanned downtime), since no I/Os are allowed and RAC will evict these nodes from the cluster. With cross-links, however, if the servers survive the “disaster”, ALL cluster nodes continue to service I/Os through the paths going to the “winning” system’s R1 devices. That storage system may be farther away, so reads that were previously serviced from the ‘closer’ system are now a little slower, but all database instances remain up and running, servicing I/Os and users.
Note: Remember that even if cross-links are not used, the database instances connected to the “winning” storage system will continue to service I/Os, and user sessions can automatically reconnect there. Still, some will prefer not to bring down any database instance if possible, even if replication stops or connectivity to one storage system is disrupted.
PowerPath uses the response time of each path to determine which paths are local and which are remote. It then places the remote paths (the paths with the higher latency) in auto-standby mode so they are not used. Only if no valid paths to the local device remain will PowerPath re-enable these paths.
Note: Auto-standby mode gives PowerPath an advantage in an SRDF/Metro configuration with cross-links, as compared to native multipathing. Only the local paths (with lower latency) service I/Os, and the remote paths over the cross-links remain available in case all the local paths fail.
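For reference, proximity-based auto-standby can be enabled with the powermt CLI. This is a hedged sketch; the exact options and output labels can vary by PowerPath release:

# Place the higher-latency (remote) paths in proximity-based autostandby
powermt set autostandby=on trigger=prox dev=all
# Verify the path states; autostandby paths are typically flagged asb:prox
powermt display dev=all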