For optimal cluster performance, we recommend observing the following inline data reduction best practices. Some of this information may be covered elsewhere in this paper.
- Inline data reduction is supported only on F910, F900, F810, F710, F600, F210, F200, H700/7000, H5600, and A300/3000 node pools. Legacy F800 nodes cannot be upgraded or converted to F810 nodes.
- Run the assessment tool on a subset of the data to be compressed or deduplicated.
- When replicating compressed or deduplicated data (or both), verify that the logical data size (the storage space saved plus the storage space actually consumed) does not exceed the total available space on the target cluster. Otherwise, the target may run out of space.
- In general, additional capacity savings may not warrant the overhead of running SmartDedupe on node pools with inline deduplication enabled. See the ‘Performance with Inline Data Reduction’ chapter for additional detail.
- Data reduction can be disabled on a cluster if the overhead of compression and deduplication is considered too high, if performance is impacted, or both.
- The software data reduction fall-back option on F810 nodes is less performant, more resource intensive, and less efficient (lower compression ratio) than hardware data reduction. Consider removing F810 nodes with failing offload hardware from the node pool.
- Run the deduplication assessment job on a single root directory at a time. If multiple directory paths are assessed in the same job, you will not be able to determine which directory should be deduplicated.
- We recommend enabling inline deduplication before rebooting the F910, F900, F810, F600, F200, and H5600 nodes in a cluster.
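
The replication capacity check described above (logical data size compared against available space on the target) can be illustrated with a short calculation. The following Python sketch is only an illustration: the function names and the capacity figures are hypothetical and are not taken from any OneFS tool or API. In practice, the input values would come from the cluster's own data reduction and capacity reporting.

```python
TIB = 1024 ** 4  # bytes per tebibyte


def logical_size_bytes(physical_consumed: int, space_saved: int) -> int:
    """Logical data size = space actually consumed + space saved by data reduction."""
    return physical_consumed + space_saved


def fits_on_target(physical_consumed: int, space_saved: int, target_available: int) -> bool:
    """Return True if the logical data size does not exceed the target's available space."""
    return logical_size_bytes(physical_consumed, space_saved) <= target_available


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    source_physical = 40 * TIB   # space consumed on the source after data reduction
    source_saved = 25 * TIB      # space saved by compression and deduplication
    target_available = 60 * TIB  # total available space on the target cluster

    logical = logical_size_bytes(source_physical, source_saved)
    print(f"Logical data size: {logical / TIB:.0f} TiB")
    print(f"Fits on target: {fits_on_target(source_physical, source_saved, target_available)}")
```

With these example figures, the logical size (65 TiB) exceeds the 60 TiB available on the target, so the check fails even though the source consumes only 40 TiB of physical space.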