In-line data reduction is supported with the following caveats:
- OneFS 8.2.1 and later supports from 4 to 252 F810 nodes, or 36 chassis, per cluster.
- OneFS 9.0 and later supports from 4 to 252 F810 or H5600 nodes, or from 3 to 252 F600 or F200 nodes, per cluster.
- OneFS 9.2 and later supports from 4 to 252 F810 or H5600 nodes, or from 3 to 252 F900, F600, or F200 nodes, per cluster.
- OneFS 9.2.1 and later supports from 4 to 252 F810, H5600, H700/7000, or A300/3000 nodes, or from 3 to 252 F900, F600, or F200 nodes, per cluster.
- OneFS 9.7 and later supports from 4 to 252 F810, H5600, H700/7000, or A300/3000 nodes, or from 3 to 252 F900, F710, F600, F210, or F200 nodes, per cluster.
- OneFS 9.8 and later supports from 4 to 252 F810, H5600, H700/7000, or A300/3000 nodes, or from 3 to 252 F910, F900, F710, F600, F210, or F200 nodes, per cluster.
- Data reduction savings depend heavily on factors like the data, cluster composition, and protection level.
- Compressed and deduplicated data never leaves the file system in compressed or deduplicated form: it is fully rehydrated on read before being returned to the client.
- Decompression is substantially less expensive than compression.
- Inline data reduction is exclusive to the F910, F900, F810, F710, F600, F210, F200, H700/7000, H5600, and A300/3000 platforms and does not require a software license.
- There is no compatibility or equivalency between F800 and F810 nodes: they cannot share the same node pool, and F800 nodes cannot store compressed data.
- There is no OneFS WebUI support for data reduction. Configuration and management are available through the CLI only.
- Partial writes to compression chunks may require reading and decompressing the entire compression chunk first, even if most of the chunk is being overwritten.
- Modifications to compression chunks may require rewriting the entire compression chunk even if only a single logical block is changed.
- Some workloads have data access patterns that exacerbate these issues and can generate more writes than if compression were not used.
- Data integrity failures with compressed data will likely mean that corruption does not affect only a single block but instead the entire compression chunk.
- If SmartPools is used on a mixed cluster containing F910, F900, F810, F710, F600, F210, F200, H700/7000, H5600, or A300/3000 nodes, data will only be compressed and/or inline deduplicated when it physically resides on these node pools. If data is tiered to non-compression node pools, it is uncompressed before it is moved, so full uncompressed capacity is required on the target pool. Conversely, if SmartPools moves data between compression pools, for example from F810 to F200, the data remains in a compressed state throughout the transfer.
- Post-process SmartDedupe can run in concert with compression and inline deduplication; this is supported but not widely used. The SmartDedupe job must first decompress data in order to deduplicate it, which is an additional resource expense.
- Although compressed files are unintelligible when stored on disk, compression does not satisfy the encryption requirements for secure data-at-rest compliance. However, PowerScale nodes are available with SED drives.
- InsightIQ is not yet fully integrated with inline data reduction and will not report compression savings. This will be addressed in a future release.
- As discussed earlier, inline compression is not free: there is always a trade-off between cluster resource consumption (CPU, memory, disk), the potential for data fragmentation, and the benefit of increased space efficiency.
- Because compression extends the capacity of a cluster, it also reduces the per-TB compute resource ratio (CPU, memory, I/O, and so on).
- Depending on an application’s I/O profile and the effect of inline data reduction on the data layout, read and write performance and overall space savings can vary considerably.
- OneFS metadata structures (inodes, b-trees, and so on) are not compressed.
- Since compression trades cluster performance for storage capacity savings, compression may not be ideally suited for heavily accessed data, or high-performance workloads.
- SmartFlash (L3) caching is not applicable to F810 nodes, since they contain only SSD flash media.
- If a heterogeneous cluster contains F810 nodes plus F800 or other non-compression nodes, data is uncompressed on the fly when it moves between pools. A non-compression node on the cluster can be an initiator for compressed writes to an F810 or H5600 pool and will perform compression in software. However, this may generate significant overhead for lower-powered Archive-class nodes.
- In-line dedupe will not permit block sharing across different hardware types or node pools to reduce the risk of performance asymmetry.
- In-line dedupe will not share blocks across files with different protection policies applied.
- OneFS metadata is not deduplicated.
- In-line dedupe will not deduplicate the data stored in a snapshot.
- There is no inline deduplication of CloudPools files.
- In-line dedupe can deduplicate common blocks within the same file and a sequence of consecutive blocks to a single block in a shadow store, resulting in even better data efficiency.
- In OneFS 9.3 and later, compression and deduplication of data contained within a writable snapshot are not supported.
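The read-modify-write amplification described in the compression-chunk caveats above can be sketched in Python. This is a conceptual illustration only, not OneFS's implementation; the 8 KiB block and 16-block chunk sizes are assumptions for the example, and zlib stands in for the hardware or software compression engine.

```python
import zlib

BLOCK = 8192           # assumed logical block size (illustrative only)
BLOCKS_PER_CHUNK = 16  # assumed blocks per compression chunk (illustrative only)

def rewrite_block(compressed_chunk: bytes, block_index: int, new_block: bytes) -> bytes:
    """Modify one logical block inside a compressed chunk.

    Even though only one block changes, the whole chunk must be read,
    decompressed, patched, and recompressed -- the amplification the
    caveats above warn about.
    """
    chunk = bytearray(zlib.decompress(compressed_chunk))    # decompress entire chunk
    chunk[block_index * BLOCK:(block_index + 1) * BLOCK] = new_block
    return zlib.compress(bytes(chunk))                      # recompress entire chunk

# Build a compressible 128 KiB chunk, then overwrite a single 8 KiB block.
original = b"A" * (BLOCK * BLOCKS_PER_CHUNK)
stored = zlib.compress(original)
updated = rewrite_block(stored, 3, b"B" * BLOCK)
restored = zlib.decompress(updated)
```

A one-block logical write here touches the full 128 KiB chunk on both the read and the write path, which is why small random overwrites to compressed data can cost more I/O than the same workload against uncompressed storage.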
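The shadow-store behavior described for inline dedupe, where blocks with identical content resolve to a single stored copy, can be illustrated with a minimal content-hashing sketch. This is a simplified model under assumed parameters (8 KiB blocks, SHA-256 matching), not OneFS's actual matching or shadow-store logic.

```python
import hashlib

BLOCK = 8192  # assumed logical block size (illustrative only)

def dedupe(data: bytes):
    """Map each logical block to a shadow-store slot, sharing slots
    between blocks with identical content (conceptual sketch only)."""
    shadow_store = []  # unique block payloads, one copy each
    index = {}         # content digest -> shadow-store slot
    block_map = []     # per logical block: which slot holds its data
    for off in range(0, len(data), BLOCK):
        block = data[off:off + BLOCK]
        digest = hashlib.sha256(block).digest()
        if digest not in index:
            index[digest] = len(shadow_store)
            shadow_store.append(block)
        block_map.append(index[digest])
    return shadow_store, block_map

# A file of 8 logical blocks with only 2 distinct payloads:
data = (b"X" * BLOCK + b"Y" * BLOCK) * 4
store, mapping = dedupe(data)
```

Here eight logical blocks collapse to two stored blocks plus a per-block mapping, which is the space-efficiency win; the caveats above (no sharing across node pools, protection policies, snapshots, or CloudPools files) are restrictions on when such sharing is allowed.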