Replication groups allow grouping of storage pools from different geographically located VDCs for replication of data between sites. Replication of data across sites has the following advantages:
As with storage pools, create only the minimum number of replication groups needed, because each storage pool/replication group pair adds indexing overhead. There is no reason to have two or more replication groups that do the same thing. For example, two replication groups containing the same set of VDC storage pools provide no value and add unnecessary overhead.
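The redundancy check described above can be sketched in a few lines of Python. This is only an illustration: the inventory dictionary, the group names, and the "VDC/storage pool" member strings are hypothetical, and in practice the membership data would have to be exported from the management interface by some means not shown here.

```python
from collections import defaultdict


def find_duplicate_groups(groups):
    """Return lists of replication group names that share an identical member set."""
    by_members = defaultdict(list)
    for name, members in groups.items():
        # frozenset makes the comparison independent of member ordering
        by_members[frozenset(members)].append(name)
    return [sorted(names) for names in by_members.values() if len(names) > 1]


# Hypothetical inventory: replication group name -> set of "VDC/storage pool" members
groups = {
    "rg-local-vdc1": {"vdc1/sp1"},
    "rg-global": {"vdc1/sp1", "vdc2/sp1", "vdc3/sp1"},
    "rg-global-copy": {"vdc3/sp1", "vdc1/sp1", "vdc2/sp1"},
}

duplicates = find_duplicate_groups(groups)
print(duplicates)
```

Any group names reported together contain the same set of VDC storage pools and are candidates for consolidation.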
The standard scenario is one replication group for local (non-replicated) data and one for replicated data that spans all VDCs. Organizations with more than two sites may consider additional replication groups for cases where data should be replicated to only a subset of sites. Generally, one replication group that spans all sites is sufficient. Compliance may dictate that additional replication groups be created, for example, where data privacy or sovereignty laws prohibit sharing data across specific borders.
When three or more sites are in a replication group, efficiencies in storage overhead can be gained: ECS can XOR chunks written at two sites and store the result at a third site. To gain these efficiencies, new writes must occur at two or more sites. To balance the efficiency across all sites in a replication group, all sites must have relatively similar write workloads. Spreading data across sites involves tradeoffs, however, and this benefit may not be appropriate for all workloads, especially where WAN latency creates unacceptable bottlenecks. For instance, lookups of objects not local to the VDC incur additional WAN latency. Geo-caching alleviates some of this, but the latency can still cause problems for applications when data is not in cache.
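The XOR idea can be illustrated with a toy Python sketch. It is not a description of ECS internals: the chunk contents, the function name `xor_chunks`, and the copy counts are illustrative assumptions, and the sketch ignores the real chunk format and any protection applied within a site. It only shows why a third site can keep one XOR result instead of two full copies and still rebuild either original.

```python
def xor_chunks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    if len(a) != len(b):
        raise ValueError("chunks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Illustrative chunks written at two different sites (same length by construction)
chunk_a = b"written-at-site-1"
chunk_b = b"written-at-site-2"

# Site 3 keeps only the XOR of the two chunks instead of a full copy of each
parity = xor_chunks(chunk_a, chunk_b)

# If site 1 is lost, chunk A is rebuilt from the XOR result and chunk B (and vice versa)
recovered_a = xor_chunks(parity, chunk_b)
recovered_b = xor_chunks(parity, chunk_a)

# Chunk-sized copies held across the three sites in this toy model
copies_without_xor = 4  # A and B at their home sites, plus full copies of both at site 3
copies_with_xor = 3     # A and B at their home sites, plus one XOR result at site 3

print(recovered_a == chunk_a, recovered_b == chunk_b)
print(copies_without_xor, copies_with_xor)
```

The sketch also shows why writes must arrive at two or more sites: with chunks from only one site, the third site has nothing to pair them with and must hold full copies.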
Replication group best practices