Once the dataset is created, the next step is to replicate the dataset to the target platform. The dataset is replicated through either a copy or repeat-copy policy:
- The copy policy is a one-time single copy of the entire dataset to the target platform.
- The repeat-copy policy is an incremental copy of a dataset that is constantly changing. It is supported only for SmartSync file replication; repeat-copy for file-to-object copy is not supported at this time. In a repeat-copy policy, the initial run copies the entire dataset, and subsequent runs copy only the changed data blocks. The new and previous datasets are compared on the source cluster, and only changed blocks are transferred to the target cluster.
Note: Before running a copy or repeat-copy policy, ensure that a dataset creation policy is completed for the source base path. Otherwise, the copy or repeat-copy policy will remain in a paused state.
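The note above suggests a pre-check before creating a copy policy. The following is a minimal sketch, assuming that the output of isi dm datasets list shows each dataset's base path as a whitespace-delimited field (the column layout is not described in this section, so the match is a heuristic). The helper name dataset_exists_for_path and the path /ifs/data/projects are hypothetical.

```shell
# Hypothetical helper: succeed if a dataset listing (read from stdin)
# contains the given base path as a whole whitespace-delimited field.
# Assumption: `isi dm datasets list` prints the base path on each dataset row.
dataset_exists_for_path() {
    base_path=$1
    awk -v p="$base_path" '
        { for (i = 1; i <= NF; i++) if ($i == p) found = 1 }
        END { exit !found }
    '
}

# Usage on a cluster (not executed here):
#   isi dm datasets list | dataset_exists_for_path /ifs/data/projects \
#       || echo "Run the dataset creation policy first"
```

Matching on whole fields rather than substrings avoids treating /ifs/data/projects-old as a match for /ifs/data/projects.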
The steps in this section assume a traditional push replication that is defined and initiated on the source cluster. After reviewing this section, see Cloud copy replication policy for file-to-object copy, or see Pull replication policy to define a pull replication policy.
As an example, run the following copy or repeat-copy policy command:
isi dm policies create [Policy Name] NORMAL true [COPY or REPEAT-COPY] --copy-source-base-path=[Source Dataset Path] --copy-create-dataset-on-target=true --copy-base-base-account-id=[Source DM Account] --copy-base-source-account-id=[Source DM Account] --copy-base-target-account-id=[Target DM Account] --copy-base-target-base-path=[Target directory path] --copy-base-target-dataset-type=FILE --copy-base-dataset-retention-period= --copy-base-dataset-reserve= --copy-base-policy-dataset-expiry-action=DELETE --start-time="YYYY-MM-DD HH:MM:SS"
In this example, some of the options are the same as those of the dataset creation policy. For more information about those options, see Dataset creation policy. You can specify additional options as follows:
- In the --copy-create-dataset-on-target field, specify whether a new dataset is created on the target cluster. If this field is set to true, the new dataset is displayed under isi dm datasets list (and under isi snapshot list, with a name prefixed by isi_dm_snap) on the target cluster, and the data remains read-only. If the field is set to false, a dataset is not created, but the data is available for read and write.
- In the --copy-base-base-account-id and --copy-base-source-account-id fields, specify the source cluster’s account ID from the isi dm accounts list command.
- In the --copy-base-target-account-id field, specify the target platform account ID from the isi dm accounts list command.
- In the --copy-base-target-dataset-type field, specify whether the target platform is FILE_ON_OBJECT_COPY or FILE. The FILE option is for copying to a PowerScale cluster, while the FILE_ON_OBJECT_COPY option is for GCP, AWS, ECS, or Azure. With FILE_ON_OBJECT_COPY, the file system data is stored on the object store in a copy format. For more information about file-to-object copy, see the section Cloud copy replication policy.
- Optionally, use the copy-source-sub-paths option to select only a subset of the directory structure that was specified under the --copy-source-base-path.
- Optionally, specify a dataset ID, which is found under isi dm datasets list, directly for replication using the copy-dataset-id option. If a copy-dataset-id is not specified, the most recent dataset for the subpath is used for the policy by default.
- Optionally, use the recurrence option to schedule the copy or repeat-copy policy, and stagger it with the dataset creation policy’s recurrence.
- Optionally, use the reconnect flag on a repeat-copy policy to specify that the target platform already has the baseline copy, allowing only the incremental updates to be replicated.
For more replication policy options, see the CLI Administration Guide on Dell Support.
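To make the example command easier to adapt, the following dry-run sketch assembles it from arguments and prints it instead of running it. The function name build_copy_policy_cmd and every sample value are hypothetical; the flags are copied from the example command in this section, and the retention-period and reserve values are left blank exactly as they are there.

```shell
# Dry-run sketch: print the copy policy command rather than executing it.
# All argument values are placeholders supplied by the caller.
build_copy_policy_cmd() {
    policy_name=$1      # policy name (hypothetical, e.g. nightly-copy)
    policy_type=$2      # COPY or REPEAT-COPY
    source_path=$3      # source dataset base path
    source_account=$4   # source account ID from `isi dm accounts list`
    target_account=$5   # target account ID from `isi dm accounts list`
    target_path=$6      # target directory path
    start_time=$7       # "YYYY-MM-DD HH:MM:SS"

    case $policy_type in
        COPY|REPEAT-COPY) ;;
        *) echo "policy type must be COPY or REPEAT-COPY" >&2; return 1 ;;
    esac

    # Each argument is printed followed by a single space.
    printf '%s ' \
        isi dm policies create "$policy_name" NORMAL true "$policy_type" \
        "--copy-source-base-path=$source_path" \
        "--copy-create-dataset-on-target=true" \
        "--copy-base-base-account-id=$source_account" \
        "--copy-base-source-account-id=$source_account" \
        "--copy-base-target-account-id=$target_account" \
        "--copy-base-target-base-path=$target_path" \
        "--copy-base-target-dataset-type=FILE" \
        "--copy-base-dataset-retention-period=" \
        "--copy-base-dataset-reserve=" \
        "--copy-base-policy-dataset-expiry-action=DELETE"
    printf '%s\n' "--start-time=\"$start_time\""
}

# Example call with placeholder values:
#   build_copy_policy_cmd nightly-copy REPEAT-COPY /ifs/data/projects \
#       SRC_ACCT TGT_ACCT /ifs/target "2030-01-01 00:00:00"
```

Printing the command first makes it possible to review the account IDs and paths before anything is created on the cluster.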
Cloud copy replication policy
The isi dm policies create command is also used for file-to-object copy replication. For file-to-object copy replication, the following considerations apply:
- The REPEAT-COPY option is not supported for file-to-object copy.
- The --copy-create-dataset-on-target field is set to true to store a copy of the dataset in the S3 bucket.
- In the --copy-base-base-account-id and --copy-base-source-account-id fields, specify the source cluster’s account ID from the isi dm accounts list command.
- In the --copy-base-target-account-id field, specify the S3 bucket’s account ID from the isi dm accounts list command.
- In the --copy-base-target-base-path field, specify a folder where the dataset will be copied. The folder is created in the S3 bucket, and the dataset is copied to [s3 bucket]/.datamover/dsr/[copy-base-target-base-path folder]. The dsr-latest file is metadata associated with the bucket contents.
- In the --copy-base-target-dataset-type field, specify FILE_ON_OBJECT_COPY.
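The file-to-object settings listed above can be summarized in a short sketch. It assumes only the flags and the bucket layout named in this section; the helper names file_to_object_flags and object_copy_location, the account ID BUCKET_ACCT, the bucket my-bucket, and the folder projects-copy are hypothetical.

```shell
# Print the fields that differ for a file-to-object copy policy.
file_to_object_flags() {
    policy_type=$1      # must be COPY; REPEAT-COPY is not supported here
    bucket_account=$2   # S3 bucket account ID from `isi dm accounts list`
    folder=$3           # folder to be created in the bucket

    if [ "$policy_type" != "COPY" ]; then
        echo "file-to-object copy supports COPY only" >&2
        return 1
    fi
    printf '%s\n' \
        "--copy-create-dataset-on-target=true" \
        "--copy-base-target-account-id=$bucket_account" \
        "--copy-base-target-base-path=$folder" \
        "--copy-base-target-dataset-type=FILE_ON_OBJECT_COPY"
}

# Where the dataset lands in the bucket, following the layout
# [s3 bucket]/.datamover/dsr/[copy-base-target-base-path folder].
object_copy_location() {
    printf '%s/.datamover/dsr/%s\n' "$1" "$2"
}

file_to_object_flags COPY BUCKET_ACCT projects-copy
object_copy_location my-bucket projects-copy
# last line printed: my-bucket/.datamover/dsr/projects-copy
```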
Pull replication policy
PowerScale to PowerScale replication
A pull replication policy is defined and initiated on the target cluster. To create one, issue the isi dm policies create command on the target cluster, and set the --copy-base-base-account-id and --copy-base-source-account-id fields to the source cluster’s account, found under isi dm accounts list on the target cluster. All other fields in the command remain as previously described.
A pull replication policy is supported for a file-to-object copy. If a dataset is copied to an S3 bucket with a push policy, as explained in the Cloud copy replication policy section, it may be retrieved from the S3 bucket with a pull policy. From the PowerScale cluster, issue the isi dm policies create command but update the --copy-base-base-account-id and --copy-base-source-account-id fields with the S3 bucket account ID from the isi dm accounts list command. Next, update the --copy-base-target-account-id field with the source cluster’s account ID from the isi dm accounts list command.
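The account swap between a push to an S3 bucket and a pull from it is easy to get backwards, so a small sketch may help. It prints only the three account fields described in this document; the function name account_flags and the IDs CLUSTER_ACCT and BUCKET_ACCT are hypothetical stand-ins for values from isi dm accounts list.

```shell
# Print the three account fields for a file-to-object policy direction.
#   push: the cluster account is base/source, the bucket account is target.
#   pull: the bucket account is base/source, the cluster account is target.
account_flags() {
    direction=$1         # push or pull
    cluster_account=$2   # PowerScale cluster account ID
    bucket_account=$3    # S3 bucket account ID
    case $direction in
        push) src=$cluster_account; tgt=$bucket_account ;;
        pull) src=$bucket_account;  tgt=$cluster_account ;;
        *) echo "direction must be push or pull" >&2; return 1 ;;
    esac
    printf '%s\n' \
        "--copy-base-base-account-id=$src" \
        "--copy-base-source-account-id=$src" \
        "--copy-base-target-account-id=$tgt"
}

# Example (placeholder IDs):
#   account_flags pull CLUSTER_ACCT BUCKET_ACCT
```

The remaining fields of isi dm policies create are unaffected by the direction and are omitted from the sketch.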