This section includes considerations and best practices for configuring PowerScale CloudPools.
CloudPools settings
CloudPools settings can be changed either globally on the CloudPools settings tab or per file pool policy in the OneFS WebUI. The recommendation is to change these settings per file pool policy. The following list includes general considerations and best practices for CloudPools settings; a hedged CLI sketch follows the list.
- Encryption: Encryption can be enabled either on the PowerScale cluster or on AWS. The recommendation is to enable encryption on the PowerScale cluster rather than on AWS. However, encryption adds load to the cluster and can affect CloudPools archive and recall performance, so if average CPU utilization on the cluster is high (greater than 70 percent), enable encryption on AWS instead. For more information about protecting data using encryption on AWS, see the AWS documentation.
- Compression: Compression can be enabled on the PowerScale cluster so that file data is compressed before it is sent to AWS. If network bandwidth is a concern, the recommendation is to enable compression to conserve network resources. Compression adds load to the cluster, so archiving files from PowerScale storage to AWS might take longer.
- Data retention: The recommendation is to explicitly set data retention for file data archived from the PowerScale cluster to AWS. If the SmartLink files are backed up with SyncIQ or NDMP, data retention defines how long the cloud objects remain on AWS. When the retention period has passed, the PowerScale cluster sends a delete command to AWS, and AWS marks the associated cloud objects for deletion. The delete process is asynchronous: space is not reclaimed until garbage collection completes, and because garbage collection is a low-priority background process, it may take days to fully reclaim the space, depending on how busy the system is.
- Local data cache: If storage space is limited on the PowerScale cluster, the recommendation is to set lower values for Writeback Frequency and Cache Expiration. Doing so reduces how long file data is kept in the local data cache, freeing storage space on the cluster sooner.
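As a hedged illustration of applying these settings per file pool policy rather than globally, the following sketch assumes a policy named CloudArchive (hypothetical) and uses illustrative values; exact option names and duration formats vary by OneFS release, so verify them with isi filepool policies modify --help.

# Sketch only: "CloudArchive" is a hypothetical policy name; verify the
# exact option names for your OneFS release before use.
isi filepool policies modify CloudArchive \
    --cloud-encryption-enabled=true \
    --cloud-compression-enabled=true \
    --cloud-writeback-frequency=1H \
    --cloud-cache-expiration=1D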
File pool policy
File pool policies define which data is archived from the PowerScale cluster to AWS. Considerations include the following; a hedged CLI sketch follows the list.
- Ensure that the priority of file pool policies is set appropriately. Multiple file pool policies can be created for the same cloud storage account. When the SmartPools job runs, it processes file pool policies in priority order.
- To free up storage space on the PowerScale cluster effectively, do not archive small files that are less than 32 KB in size; each archived file leaves a SmartLink file behind on the cluster, so archiving very small files yields little or no net space savings.
- Do not archive files that need to be updated frequently, because each local update must be written back to the cloud.
- OneFS supports a maximum of 128 file pool policies (SmartPools and CloudPools combined). The recommendation is not to exceed 30 file pool policies per PowerScale cluster.
- If a file pool policy is updated, it has no impact on files that are already archived. It affects only the files to be archived the next time the SmartPools job runs.
- Archiving based on modified or created time, rather than accessed time, can select files that are used often, including applications, libraries, and scripts. Take care to exclude these types of files from being archived to the cloud; otherwise, clients or users experience delays when loading these applications. One example is archiving user home directories that contain files that are created once but accessed often.
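For example, the following hedged sketch creates a policy that archives only files larger than 32 KB to a CloudPool. The names archive-cold and aws-pool are hypothetical, and the filter and CloudPool option syntax should be verified with isi filepool policies create --help for your OneFS release; time-based criteria (modified or accessed time) can be added to the same filter.

# Sketch only: "archive-cold" and "aws-pool" are hypothetical names, and
# the filter syntax below must be verified against your OneFS release.
isi filepool policies create archive-cold \
    --begin-filter \
      --size=32KB --operator=gt \
    --end-filter \
    --cloudpool=aws-pool
# List policies to confirm the order in which the SmartPools job applies them
isi filepool policies list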
Other considerations
More considerations include the following:
- Deduplication: CloudPools can archive deduplicated files from a PowerScale cluster to cloud storage. However, the files are recalled un-deduplicated when they are brought back from the cloud to the PowerScale cluster. For more information about deduplication within OneFS, see the white paper Next Generation Storage Efficiency with Dell PowerScale SmartDedupe.
- Small file storage efficiency (SFSE): CloudPools and SFSE cannot work together; on PowerScale clusters using CloudPools, SmartLink files cannot be containerized or packed. It is best practice not to archive small files that would be optimized by SFSE: the efficiencies gained from implementing SFSE for small files outweigh the storage advantages gained from archiving them to the cloud using CloudPools. For more information about the Small File Storage Efficiency feature of OneFS, see the white paper Dell PowerScale OneFS Storage Efficiency.
- Network proxy: When a PowerScale cluster cannot connect to the CloudPools storage target directly, network proxy servers can be configured to provide an alternate path to the cloud storage.
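For example (a hedged sketch: the proxy name, host, port, and account name are hypothetical, and subcommand syntax should be verified with isi cloud proxies --help):

# Sketch only: names, host, and port below are hypothetical.
isi cloud proxies create --name=cp-proxy --host=proxy.example.com \
    --port=1080 --type=socks_5
# Associate the proxy with an existing cloud storage account
isi cloud accounts modify aws-account --proxy=cp-proxy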
- SmartConnect: If users access SmartLink files regularly through a specific node, the inline access path can become congested and impact client performance. You can configure PowerScale SmartConnect to load-balance connections across the cluster. For more information about SmartConnect, see the white paper Dell PowerScale Network Design Considerations.
- Cloud storage account: Do not delete a cloud storage account that is in use by archived files. Any attempt to open a SmartLink file associated with a deleted account will fail. In addition, NDMP backup and restore and SyncIQ failover and failback will fail when a cloud storage account has been deleted.
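Before deleting any account, it is worth confirming which accounts, CloudPools, and file pool policies reference it. A minimal check might look like the following (the verbose listing flag is a common isi convention; verify for your release):

# Review existing cloud accounts and CloudPools before any deletion
isi cloud accounts list
isi cloud pools list
# Verbose policy listing shows which CloudPool each policy targets
isi filepool policies list -v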
- Cloud objects and data retention: Cloud objects are crucial for SmartLink files; any attempt to open a SmartLink file whose cloud objects have been deleted will fail. Before garbage collection, OneFS checks both data retention and the reference count for cloud objects: when data retention has expired and the reference count is zero, the cloud objects are deleted through garbage collection. Data retention determines the Date of Death (DoD) setting for the objects that support a SmartLink file; the DoD triggers garbage collection only when the reference count is zero. The reference count indicates whether cloud objects are still associated with SmartLink files, including SmartLink files in snapshots, SyncIQ backups, and NDMP backups. The considerations include the following; a hedged CLI sketch follows the list:
- Data retention periods include the cloud data retention period, the incremental backup retention period (for NDMP incremental backup and SyncIQ), and the full backup retention period (for NDMP only). If more than one period applies to a SmartLink file, the longest period is used.
- If a SmartLink file is unchanged through multiple SyncIQ or NDMP backups, its data retention remains unchanged.
- Data retention is set or updated by any event that changes the backed-up version of a file or the state of the SmartLink file.
- If a SmartLink file changes and is incrementally backed up, its data retention is set to the current time plus the incremental backup retention period.
- If a SmartLink file is recalled, the reference count is removed and its data retention is set to the current time plus the cloud data retention period. Its cloud objects are deleted through garbage collection after the data retention expires.
- If a SmartLink file is deleted, its data retention is set to the current time plus the cloud data retention period. If the cloud objects are still associated with snapshots, SyncIQ backups, or NDMP backups, they are not deleted through garbage collection even after the data retention expires.
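As a hedged sketch of configuring these retention periods per file pool policy (the policy name CloudArchive is hypothetical, the values are illustrative, and option names and duration formats vary by OneFS release):

# Sketch only: verify option names with "isi filepool policies modify --help"
isi filepool policies modify CloudArchive \
    --cloud-data-retention=8W \
    --cloud-incremental-backup-retention=12W \
    --cloud-full-backup-retention=5Y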
- OneFS upgrade (CloudPools 1.0 to CloudPools 2.0): Before beginning the upgrade, it is recommended to check the OneFS CloudPools upgrade path shown in the following table.
Table 3. OneFS CloudPools upgrade path
Current OneFS version | Upgrade to 8.2.0 | Upgrade to 8.2.1 | Upgrade to 8.2.2 | Upgrade to 9.0.0 or later
8.0.x or 8.1.x | Strongly discouraged | OK if needed, but 8.2.2 is recommended | Strongly recommended | Strongly recommended
Note: Contact your Dell representative if you plan to upgrade OneFS to 8.2.0.
In a SyncIQ environment with unidirectional replication, upgrade the SyncIQ target cluster before the source cluster. OneFS can then convert CloudPools-1.0-formatted SmartLink files into CloudPools-2.0-formatted SmartLink files through the post-upgrade SmartLink conversion process. Otherwise, the SyncIQ policy must be reconfigured for deep copy, which causes archived file content to be read from the cloud and replicated in full. In a SyncIQ environment with bidirectional replication, it is recommended to disable SyncIQ on both source and target clusters, upgrade both clusters simultaneously, and re-enable SyncIQ only after the OneFS upgrade has been committed on both. Depending on the number of SmartLink files on the target DR cluster and the processing power of that cluster, the SmartLink conversion process can take considerable time.
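If deep copy is unavoidable, it is configured on the SyncIQ policy. A hedged sketch follows (the policy name dr-policy is hypothetical; verify the option with isi sync policies modify --help):

# Sketch only: force archived (SmartLink) files to replicate as full files
isi sync policies modify dr-policy --cloud-deep-copy=force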
Note: In a SyncIQ environment with unidirectional replication, there is no need to stop SyncIQ or snapshot operations during the upgrade. However, because SyncIQ must resynchronize all converted stub files, it may take SyncIQ some time to catch up with all the changes.
To check the status of the SmartLink upgrade process, run the following command, substituting the appropriate job number.
# isi cloud job view 6
ID: 6
Description: Update SmartLink file formats
Effective State: running
Type: smartlink-upgrade
Operation State: running
Job State: running
Create Time: 2019-08-23T14:20:26
State Change Time: 2019-09-17T09:56:08
Completion Time: -
Job Engine Job: -
Job Engine State: -
Total Files: 21907433
Total Canceled: 0
Total Failed: 61
Total Pending: 318672
Total Staged: 0
Total Processing: 48
Total Succeeded: 21588652
Note: CloudPools recall jobs will not run while SmartLink upgrade or conversion is in progress.
For a Not All Nodes on Network (NANON) cluster, it is recommended to connect the unconnected nodes to the network before starting the SmartLink conversion. Also, disable the SnapshotDelete job until the SmartLink conversion is completed.
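A hedged sketch of disabling and later re-enabling the SnapshotDelete job type (verify the job type name and options with isi job types list and isi job types modify --help):

# Disable SnapshotDelete until the SmartLink conversion completes
isi job types modify SnapshotDelete --enabled=false
# Re-enable it after the conversion is done
isi job types modify SnapshotDelete --enabled=true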