Replication cache
A portion of the metadata in cache is dedicated to Replication Data Pointers (RDP), which keep track of the snapshot deltas in the SRP. This portion of metadata is called Replication Cache. If Replication Cache is exhausted, snapshots may begin to fail.
Replication Cache is used by SnapVX snapshots and VP Snap sessions. Replication Cache usage increases as SnapVX and VP Snap source devices are written to, as there is more point-in-time data to manage. SnapVX Linked Targets, Clone, and Mirror sessions do not influence Replication Cache usage.
It is important to understand that Replication Cache resources are not released immediately when a snapshot is terminated. The release runs as a background process so that it does not take processing resources away from higher-priority operations within the array, and it may therefore take some time. For this reason, check current Replication Cache usage before creating new snapshots immediately after terminating existing ones.
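The check described above can be automated with a simple polling loop. The following is a minimal sketch, not a Dell EMC tool: the function name, the 50 percent limit, and the polling interval are all hypothetical, and `read_usage` stands for any caller-supplied function that returns the current Replication Cache usage as a percentage.

```python
import time
from typing import Callable


def wait_for_cache_headroom(
    read_usage: Callable[[], int],
    limit_percent: int = 50,      # hypothetical limit, not a Dell EMC default
    poll_seconds: float = 60.0,   # hypothetical polling interval
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until Replication Cache usage is at or below limit_percent.

    Returns True as soon as a reading is at or below the limit, or False
    if max_polls readings were taken without reaching it.
    """
    for attempt in range(max_polls):
        if read_usage() <= limit_percent:
            return True
        # Release happens in the background and can take a while, so wait
        # before re-reading. Do not sleep after the final reading.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    return False
```

A script that terminates snapshots and then creates new ones could call this function between the two steps and skip the create step if it returns False.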
All systems follow the same algorithm to determine the percentage of metadata to dedicate to Replication Cache. Systems that have very high snapshot usage may be candidates to have the Replication Cache portion of metadata increased. However, this is not a typical use case. In most systems, the standard algorithm provides enough Replication Cache for the environment. Contact your local Dell EMC Service Representative if you feel your environment requires increased Replication Cache.
The PowerMaxOS 5978 Q3 2019 SR contains enhancements that greatly improve Replication Cache efficiency.
Solutions Enabler 8.2 and Unisphere 8.2 introduced tools to monitor Replication Cache usage. These tools are available only with systems running HYPERMAX OS 5977.810.184 or later. The information described below is also available through the REST API beginning with version 8.3.
The symcfg list -v output has a field that shows the current Replication Cache usage, as shown in the following:
Symmetrix ID : 00019680XYZ (Local)
Time Zone : Eastern Standard Time
Product Model : VMAX100K
Symmetrix ID : 00019680XYZ
Microcode Version (Number) : 5977 (17590000)
--------------------< TRUNCATED >--------------------------
Max # of DA Write Pending Slots : N/A
Max # of Device Write Pending Slots : 85622
Replication Usage (Percent) : 12
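For scripted monitoring, the usage value can be pulled out of the command output as text. The following is a minimal sketch that assumes the field label appears as in the example above. The function name is hypothetical, and accepting an optional word "Cache" in the label is a precaution against label differences between releases, not something the example confirms.

```python
import re
from typing import Optional

# Matches the usage line as shown in the example output. The optional
# "Cache" is an assumption to tolerate a slightly different label.
_USAGE_RE = re.compile(
    r"^\s*Replication(?: Cache)? Usage \(Percent\)\s*:\s*(\d+)\s*$",
    re.MULTILINE,
)


def replication_cache_usage(symcfg_output: str) -> Optional[int]:
    """Return the Replication Cache usage percentage, or None if absent."""
    match = _USAGE_RE.search(symcfg_output)
    return int(match.group(1)) if match else None
```

The input string would typically be the captured standard output of symcfg list -v, for example collected with Python's subprocess module on a host where Solutions Enabler is installed.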
Solutions Enabler provides alert ID 1222 to report when Replication Cache usage has exceeded the specified thresholds. The default threshold values are the same as the defaults for other Solutions Enabler threshold alerts, and the user can specify other values. However, the alerts are not enabled by default and must be enabled by the user. The default threshold values are as follows:
Unisphere 9.0 reports Replication Cache usage on the Efficiency tab of the Performance Dashboard:
Alerts are also available in Unisphere. The System Alerts have the following threshold values and are enabled by default:
Users can choose to create custom alerts with their own threshold values:
As stated earlier, snapshots may begin to fail if Replication Cache is exhausted. Solutions Enabler and Unisphere snapshot outputs include a flag to indicate if a snapshot is in a failed state.
Solutions Enabler 9.0 and Unisphere for PowerMax introduced an additional flag to indicate the reason for the failure, which includes exhausted RDP resources. The following is an example of the symsnapvx command that lists only failed snapshots.
The following is an example of failed snapshots viewed from Unisphere for PowerMax 9.0. Note that the snapshots in the Unisphere for PowerMax example failed due to SRP resources being exhausted, which typically indicates the used capacity of the SRP reached the Reserved Capacity threshold.
C:\>symsnapvx -sid XYZ list -failed
Symmetrix ID : 000197800XYZ (Microcode Version: 5978)
--------------------------------------------------------------
Sym Num Flags
Dev Snapshot Name Gens FLRG TSEB Last Snapshot Timestamp
----- ------------- ---- ---- ---- ------------------------
0003B Snap1 1 R... .... Wed Dec 20 08:30:34 2017
0003C Snap1 1 R... .... Wed Dec 20 08:30:34 2017
0003D Snap1 1 R... .... Wed Dec 20 08:30:34 2017
0003E Snap1 1 R... .... Wed Dec 20 08:30:34 2017
Flags:
(F)ailed : X = General Failure, . = No Failure
: S = SRP Failure, R = RDP Failure, M = Mixed Failure
(L)ink : X = Link Exists, . = No Link Exists
(R)estore : X = Restore Active, . = No Restore Active
(G)CM : X = GCM, . = Non-GCM
(T)ype : Z = zDP snapshot, . = normal snapshot
(S)ecured : X = Secured, . = Not Secured
(E)xpanded : X = Source Device Expanded, . = Source Device Not Expanded
(B)ackground: X = Background define in progress, . = No Background define
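When many devices are involved, the failure reason can be decoded from the listing programmatically. The following is a minimal sketch that assumes the column layout shown in the example above (device, snapshot name, generation count, two four-character flag groups, timestamp) and snapshot names without spaces. The function and dictionary names are hypothetical. The reason text comes from the (F)ailed entry in the flag legend.

```python
import re
from typing import Dict, List

# Meaning of the first character of the FLRG flag group, per the legend.
FAILURE_REASONS = {
    "X": "General Failure",
    "S": "SRP Failure",
    "R": "RDP Failure",
    "M": "Mixed Failure",
    ".": "No Failure",
}

# One data row: hex device ID, name, generations, FLRG, TSEB, timestamp.
_ROW_RE = re.compile(
    r"^(?P<dev>[0-9A-Fa-f]{4,5})\s+(?P<name>\S+)\s+(?P<gens>\d+)\s+"
    r"(?P<flrg>[A-Z.]{4})\s+(?P<tseb>[A-Z.]{4})\s+(?P<ts>.+)$"
)


def parse_failed_snapshots(output: str) -> List[Dict[str, str]]:
    """Extract snapshot rows and decode the (F)ailed flag for each one."""
    rows = []
    for line in output.splitlines():
        match = _ROW_RE.match(line.strip())
        if not match:
            continue  # header, separator, or legend line
        failed_flag = match.group("flrg")[0]
        rows.append({
            "device": match.group("dev"),
            "snapshot": match.group("name"),
            "generations": match.group("gens"),
            "failure": FAILURE_REASONS.get(failed_flag, "Unknown"),
            "timestamp": match.group("ts"),
        })
    return rows
```

Applied to the listing above, each of the four rows would decode to "RDP Failure", which points to Replication Cache exhaustion rather than SRP capacity.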