SmartDedupe and tiering

OneFS SmartDedupe maximizes the storage efficiency of a cluster by decreasing the amount of physical storage required to house an organization’s data. Efficiency is achieved by scanning the on-disk data for identical blocks and then eliminating the duplicates. This approach is commonly referred to as postprocess, or asynchronous, deduplication. After duplicate blocks are discovered, SmartDedupe moves a single copy of those blocks to a special set of files known as shadow stores. During this process, duplicate blocks are removed from the files themselves, and replaced with pointers to the shadow stores.
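The post-process flow described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the OneFS implementation: the block size, the use of SHA-256 fingerprints, and the names `split_blocks`, `dedupe`, and `read_file` are all hypothetical and chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 8  # toy block size in bytes, chosen only for illustration


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Split file contents into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def dedupe(files: dict) -> tuple:
    """Scan already-written files and move duplicate blocks to a shadow store.

    Returns (layouts, shadow_store). Each layout is a list of
    ("local", block_bytes) or ("shadow", fingerprint) entries.
    """
    # Pass 1: count how often each block fingerprint occurs across all files.
    counts = {}
    for data in files.values():
        for block in split_blocks(data):
            digest = hashlib.sha256(block).hexdigest()
            counts[digest] = counts.get(digest, 0) + 1

    # Pass 2: keep one copy of each duplicated block in the shadow store
    # and replace the block in every file with a pointer to that copy.
    shadow_store = {}
    layouts = {}
    for name, data in files.items():
        layout = []
        for block in split_blocks(data):
            digest = hashlib.sha256(block).hexdigest()
            if counts[digest] > 1:
                shadow_store.setdefault(digest, block)
                layout.append(("shadow", digest))
            else:
                layout.append(("local", block))
        layouts[name] = layout
    return layouts, shadow_store


def read_file(layout: list, shadow_store: dict) -> bytes:
    """Reassemble a file by following its shadow-store pointers."""
    return b"".join(
        shadow_store[value] if kind == "shadow" else value
        for kind, value in layout
    )
```

In this model, two files that share one block end up with a single copy of that block in the shadow store, and reading either file back through its pointers returns the original contents.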
OneFS SmartDedupe does not deduplicate files that span SmartPools node pools or tiers, or that have different protection levels set. This restriction avoids any performance or protection asymmetry that could occur if portions of a file live on different classes of storage. However, a deduplicated file that is moved to a pool with a different disk pool policy ID retains its references to the shadow store on the original pool. Retaining these references breaks the rule against deduplicating across different disk pool policies, but it is preferable to rehydrating files that are moved.
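The eligibility rule above can be sketched as a simple predicate. This is a minimal illustration that assumes a file can be described by a disk pool policy ID and a protection level; the `FileInfo` type, its field names, and `can_dedupe_together` are hypothetical and do not correspond to any OneFS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical description of a file for this sketch."""
    name: str
    disk_pool_policy_id: int
    protection_level: str


def can_dedupe_together(a: FileInfo, b: FileInfo) -> bool:
    """Two files are candidates for sharing blocks only if they live under
    the same disk pool policy and have the same protection level."""
    return (
        a.disk_pool_policy_id == b.disk_pool_policy_id
        and a.protection_level == b.protection_level
    )
```

Under this model, two files with the same policy ID and protection level are candidates, while a file on a different tier or with a different protection level is excluded from matching against them.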
After such a move, further deduplication activity on that file can no longer reference any blocks in the original shadow store; the file must instead be deduplicated against other files in the same disk pool policy. If the file had not yet been deduplicated when it was moved, the deduplication index might still hold an entry that places the file on the original pool. This stale entry is discovered and corrected when a match is made against blocks in the file. If the moved file had already been deduplicated, the deduplication index knows about the shadow store only, and because the shadow store has not moved, it does not cause problems for further matching. A similar situation occurs if the shadow store (but not both files) is moved as well: the SmartDedupe job discovers the move and purges knowledge of the shadow store from the deduplication index.
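The lazy correction of the deduplication index can be pictured with the following sketch. It is a simplified model under stated assumptions, not the actual SmartDedupe index: the `IndexEntry` type, the `match_block` function, and the `current_policy_of` lookup table are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    """Hypothetical index record: who owns a block and where the index
    believes that owner lives."""
    owner: str
    policy_id: int
    is_shadow_store: bool


def match_block(index: dict, fingerprint: str,
                candidate_policy_id: int, current_policy_of: dict):
    """Try to match a candidate block against the index.

    Stale entries are only noticed here, at match time:
    a moved file has its recorded policy corrected, and a moved
    shadow store is purged from the index.
    Returns the owner name on a valid match, otherwise None.
    """
    entry = index.get(fingerprint)
    if entry is None:
        return None

    actual_policy = current_policy_of[entry.owner]
    if actual_policy != entry.policy_id:
        if entry.is_shadow_store:
            # The shadow store was moved: drop what the index knows about it.
            del index[fingerprint]
            return None
        # The file was moved: correct the stale location.
        entry.policy_id = actual_policy

    # Only match within the same disk pool policy.
    if entry.policy_id != candidate_policy_id:
        return None
    return entry.owner
```

In this model nothing is fixed at the time of the move itself. The index is repaired only when a later match touches the stale entry, which mirrors the discover-and-correct behavior described above.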
Note: Further information is available in the OneFS SmartDedupe white paper.
