In-line data reduction is supported with the following caveats:
- OneFS 8.2.1 will support 4 to 252 F810 nodes, or 36 chassis, per cluster.
- OneFS 8.2.2 will support 4 to 252 H5600 or F810 nodes per cluster.
- OneFS 9.0 will support 3 to 252 F600 or F200 nodes, or 4 to 252 H5600 or F810 nodes per cluster.
- OneFS 9.2 will support 3 to 252 F900, F600, or F200 nodes, or 4 to 252 H5600 or F810 nodes per cluster.
- OneFS 9.2.1 and later will support 3 to 252 F900, F600, or F200 nodes, or 4 to 252 F810, H700/7000, H5600, or A300/3000 nodes per cluster.
- OneFS 9.7 and later will support 3 to 252 F900, F710, F600, F210, or F200 nodes, or 4 to 252 F810, H700/7000, H5600, or A300/3000 nodes per cluster.
- OneFS 9.8 and later will support 3 to 252 F910, F900, F710, F600, F210, or F200 nodes, or 4 to 252 F810, H700/7000, H5600, or A300/3000 nodes per cluster.
- Data reduction savings depend heavily on factors like the data, cluster composition, and protection level.
- Compressed and deduplicated data never leaves the file system in compressed or deduplicated form; it is always rehydrated on read.
- Decompression is substantially less expensive than compression.
- In-line data reduction is exclusive to the PowerScale F910, F900, F810, F710, F600, F210, F200, H700/7000, H5600, and A300/3000 nodes and does not require a software license. In-line data reduction will be automatically disabled on any other node pools.
- In OneFS 9.3 and earlier, in-line compression is automatically enabled on supporting nodes, whereas in-line deduplication is disabled. The following command-line syntax will activate in-line deduplication on a compression cluster:
isi dedupe inline settings modify --mode enabled
- In-line deduplication is enabled by default for new clusters running OneFS 9.4. For earlier OneFS releases, in-line deduplication is disabled by default.
- There is no compatibility or equivalency between F800 and F810 nodes: they cannot share the same node pool, and F800 nodes cannot store compressed data.
- There is no OneFS WebUI support for data reduction. Configuration and management are through the CLI only.
- Partial writes to compression chunks may require reading the entire compression chunk first and decompressing it. This is true even if most of the compression chunk is being written.
- Modifications to compression chunks may require rewriting the entire compression chunk even if only a single logical block is changed.
- Some workloads have data access patterns that exacerbate these issues and can generate more writes than if compression were not used.
- Data integrity failures with compressed data will likely mean that corruption affects the entire compression chunk rather than just a single block.
- If SmartPools is used on a mixed cluster containing compression nodes, data will only be compressed and/or in-line deduplicated when it physically resides on the compression node pool or pools. If data is tiered to non-compression node pools, it will be uncompressed on the compression nodes before it is moved, so full uncompressed capacity will be required on the compressed pool.
- Post-process SmartDedupe can run in concert with compression and in-line deduplication. It is supported but not widely used. The SmartDedupe job will have to decompress data first to perform deduplication, which is an additional resource expense.
- In-line compression on the F810 uses a dedicated back-end network card, so F810 nodes will not support two-way NDMP through a Gen6 back-end Ethernet or Fibre Channel controller.
- Even though compressed files are unintelligible when stored on disk, this typically does not satisfy encryption requirements. However, PowerScale nodes are available with self-encrypting drives (SED) for secure data-at-rest compliance.
- InsightIQ is not yet fully integrated with in-line data reduction and will not report compression savings. This will be addressed in a future release.
- As discussed earlier, in-line compression is not free. There is always a trade-off between cluster resource consumption (CPU, memory, disk), the potential for data fragmentation, and the benefit of increased space efficiency.
- Since compression extends the capacity of a cluster, it also has the effect of reducing the per-TB compute resource ratio (CPU, memory, I/O).
- Depending on an application’s I/O profile and the effect of in-line data reduction on the data layout, read and write performance and overall space savings can vary considerably.
- OneFS metadata structures (inodes, b-trees) are not compressed.
- Since compression trades cluster performance for storage capacity savings, compression may not be ideally suited for heavily accessed data, or high-performance workloads.
- SmartFlash (L3) caching is not applicable to F-series nodes, since they contain only SSD flash media.
- If a heterogeneous cluster contains PowerScale F910, F900, F810, F710, F600, F210, F200, H700/7000, H5600, or A300/3000 chassis plus other nodes that do not support in-line data reduction, data will be uncompressed as it moves between pools. A non-compression node on the cluster can be an initiator for compressed writes to a compression pool and will perform compression in software. However, this may generate significant overhead for lower powered Archive class nodes.
- In-line deduplication will not permit block sharing across different hardware types or node pools to reduce the risk of performance asymmetry.
- In-line deduplication will not share blocks across files with different protection policies applied.
- OneFS metadata is not deduplicated.
- In-line deduplication will not deduplicate the data stored in a snapshot.
- There is no in-line deduplication of CloudPools files.
- In-line deduplication can deduplicate common blocks within the same file and a sequence of consecutive blocks to a single block in a shadow store, resulting in even better data efficiency.
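As a rough illustration of how the savings above are typically expressed, the overall data reduction ratio is the logical (pre-reduction) data size divided by the physical (post-reduction) size on disk. The sketch below uses hypothetical values, not measurements from any specific cluster:

```python
def reduction_ratio(logical_bytes: int, physical_bytes: int) -> float:
    """Overall data reduction ratio: logical (pre-reduction) size
    divided by physical (post-reduction) size on disk."""
    return logical_bytes / physical_bytes

# Hypothetical example: 100 TiB of logical data stored in 40 TiB on disk
ratio = reduction_ratio(100 * 2**40, 40 * 2**40)
print(f"{ratio:.1f}:1")  # 2.5:1
```

Because the ratio depends on the data, cluster composition, and protection level, the same cluster can report very different values for different workloads.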
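The read-modify-write amplification described above for partial writes to compression chunks can be sketched as follows. The function name and the 128 KiB chunk size are illustrative assumptions for this example, not OneFS internals:

```python
def bytes_touched_by_partial_write(chunk_size: int, write_size: int) -> tuple:
    """Illustrative read-modify-write cost of a partial write to a
    compression chunk: the whole chunk must be read and decompressed,
    modified, then recompressed and rewritten, even when only a few
    logical blocks actually change."""
    # A write smaller than the chunk forces a read of the whole chunk first.
    bytes_read = chunk_size if write_size < chunk_size else 0
    # The entire chunk is rewritten regardless of the write size.
    bytes_written = chunk_size
    return bytes_read, bytes_written

# A 4 KiB write into a (hypothetical) 128 KiB chunk touches far more
# data than the application actually wrote:
read, written = bytes_touched_by_partial_write(128 * 1024, 4 * 1024)
print(read, written)  # 131072 131072
```

This is why access patterns dominated by small in-place updates can see more back-end I/O with compression enabled than without it.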
For more information, see the in-line data reduction white paper.