Inline data reduction efficiency
Compression and deduplication can significantly increase the storage efficiency of data. However, the actual space savings will vary depending on the specific attributes of the data itself.
The following table illustrates the relationship between the effective to usable and effective to raw ratios for the F910, F900, F810, F710, F600, F210, F200, H700, H7000, H5600, A300, and A3000 platforms:
The following table provides descriptions for the various OneFS reporting metrics, such as those returned by the ‘isi statistics data-reduction’ command described below. The table attempts, where appropriate, to equate the OneFS nomenclature with more general industry terminology:
Note: The color scheme in this table is used throughout this paper to categorize and distinguish between the various data metrics.
The interrelation of the data capacity metrics described above can be illustrated in a graphical representation.
As we can see, the preprotected physical (usable) value is derived by subtracting the protection overhead from the protected physical (raw) metric. Similarly, the difference in size between preprotected physical (usable) and logical data (effective) is the efficiency savings. If OneFS SmartDedupe is also licensed and running on the cluster, this data reduction savings value will reflect a combination of compression, inline deduplication, and post-process deduplication savings.
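The relationships described above can be sketched as simple arithmetic. The following is a minimal illustration rather than anything OneFS provides: the class and property names are hypothetical, and the figures (6.02 logical, 2.37 preprotected physical, 3.40 protected physical) are illustrative values assumed to share a single unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityMetrics:
    """Capacity figures in one consistent unit (hypothetical helper, not a OneFS API)."""

    logical: float                # logical data (effective)
    preprotected_physical: float  # preprotected physical (usable)
    protected_physical: float     # protected physical (raw)

    @property
    def protection_overhead(self) -> float:
        # Raw minus usable: the space consumed by protection.
        return self.protected_physical - self.preprotected_physical

    @property
    def data_reduction_savings(self) -> float:
        # Effective minus usable: the space saved by data reduction.
        return self.logical - self.preprotected_physical


# Illustrative values only; substitute the figures reported by the cluster.
metrics = CapacityMetrics(logical=6.02, preprotected_physical=2.37, protected_physical=3.40)
print(f"protection overhead:    {metrics.protection_overhead:.2f}")
print(f"data reduction savings: {metrics.data_reduction_savings:.2f}")
```

Keeping the three base metrics together and deriving the rest makes it harder to mix usable and raw figures by accident.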
OneFS provides these principal reporting methods for obtaining efficiency information with inline data reduction.
The most comprehensive of the data reduction reporting CLI utilities is the ‘isi statistics data-reduction’ command. For example:
The ‘recent writes’ data to the left of the output provides precise statistics for the five-minute period prior to running the command. By contrast, the ‘cluster data reduction’ metrics on the right of the output are slightly less real-time but reflect the overall data and efficiencies across the cluster.
Note: In OneFS 9.1 and earlier, the right-hand column metrics are designated by the ‘Est’ prefix, denoting an estimated value. However, in OneFS 9.2 and later, the ‘logical data’ and ‘preprotected physical’ metrics are now tracked and reported accurately, rather than estimated.
The ratio data in each column is calculated from the values above it. For instance, to calculate the data reduction ratio, the ‘logical data’ (effective) is divided by the ‘preprotected physical’ (usable) value. From the output above, this would be:
6.02 / 2.37 = 2.54, or a data reduction ratio of 2.54:1
Similarly, the ‘efficiency ratio’ is calculated by dividing the ‘logical data’ (effective) by the ‘protected physical’ (raw) value. From the output above, this yields:
6.02 / 3.40 = 1.77, or an efficiency ratio of 1.77:1
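The two divisions can be reproduced with a short script. This is a sketch only: the function name is hypothetical, and the inputs are the three values quoted in the calculations above.

```python
def reduction_ratios(logical, preprotected_physical, protected_physical):
    """Return (data reduction ratio, efficiency ratio) as plain floats.

    Hypothetical helper: both ratios divide the logical (effective) value,
    first by the usable value and then by the raw value.
    """
    if preprotected_physical <= 0 or protected_physical <= 0:
        raise ValueError("physical values must be positive")
    return logical / preprotected_physical, logical / protected_physical


data_reduction, efficiency = reduction_ratios(6.02, 2.37, 3.40)
print(f"Data reduction ratio: {data_reduction:.2f}:1")
print(f"Efficiency ratio:     {efficiency:.2f}:1")
```

Because the efficiency ratio divides by the raw value, which includes protection overhead, it is always the lower of the two for the same data set.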
From the OneFS CLI, the ‘isi compression stats’ command provides the option to either view or list compression statistics. When run in ‘view’ mode, the command returns the compression ratio for both compressed and all writes, plus the percentage of incompressible writes, for a prior five-minute (300 seconds) interval. For example:
# isi compression stats view
stats for 300 seconds at: 2021-04-14 15:46:04 (1618429564)
compression ratio for compressed writes: 3.12 : 1
compression ratio for all writes: 3.12 : 1
incompressible data percent: 6.25%
total logical blocks: 784
total physical blocks: 251
writes for which compression was not attempted: 0.00%
Note: If the ‘incompressible data’ percentage is high in a mixed cluster, there is a strong likelihood that the majority of the writes are going to a non-compression pool.
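As a cross-check, the block counts in the sample output above appear to account for the reported ratio for all writes. The sketch below parses a copy of that sample text and divides logical blocks by physical blocks. The parsing function is hypothetical, and the assumption that the ratio is derived this way is inferred from the sample figures rather than from a documented formula.

```python
import re

# Copy of the sample 'isi compression stats view' output shown above.
SAMPLE = """\
stats for 300 seconds at: 2021-04-14 15:46:04 (1618429564)
compression ratio for compressed writes: 3.12 : 1
compression ratio for all writes: 3.12 : 1
incompressible data percent: 6.25%
total logical blocks: 784
total physical blocks: 251
writes for which compression was not attempted: 0.00%
"""


def parse_block_counts(text):
    """Extract the logical and physical block totals (hypothetical helper)."""
    counts = {}
    for kind in ("logical", "physical"):
        match = re.search(rf"total {kind} blocks:\s*(\d+)", text)
        if match is None:
            raise ValueError(f"no 'total {kind} blocks' line found")
        counts[kind] = int(match.group(1))
    return counts


counts = parse_block_counts(SAMPLE)
ratio = counts["logical"] / counts["physical"]
print(f"logical / physical blocks: {ratio:.2f} : 1")
```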
The ‘isi compression stats’ CLI command also accepts the ‘list’ argument, which consolidates a series of recent reports into a list of the compression activity across the file system. For example:
The ‘isi compression stats’ data is used for calculating the right-hand side estimated ‘Cluster Data Reduction’ values in the ‘isi statistics data-reduction’ command described above. It also provides a count of logical and physical blocks and compression ratios, plus the percentage metrics for incompressible and skipped blocks.
The value in the ‘statistic’ column at the left of the table represents the epoch timestamp for each sample. This epoch value can be converted to a human readable form using the ‘date’ CLI command. For example:
# date -r 1618425636
Wed Apr 14 14:40:36 EDT 2021
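When the samples are processed off-cluster, the same conversion can be done in a script. The sketch below uses the epoch value and timestamp that appear together in the ‘isi compression stats view’ sample earlier in this section (1618429564 and 2021-04-14 15:46:04). The fixed UTC-4 offset stands in for EDT and is an assumption; a production script would use a real time zone database.

```python
from datetime import datetime, timedelta, timezone

# Epoch value taken from the 'isi compression stats view' sample header.
epoch = 1618429564

# Convert to an aware UTC datetime first, then shift to the display zone.
utc_time = datetime.fromtimestamp(epoch, tz=timezone.utc)

# Assumption: EDT modeled as a fixed UTC-4 offset for illustration only.
edt = timezone(timedelta(hours=-4), "EDT")
local_time = utc_time.astimezone(edt)

print(utc_time.strftime("%Y-%m-%d %H:%M:%S %Z"))
print(local_time.strftime("%Y-%m-%d %H:%M:%S %Z"))
```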
From the OneFS CLI, the ‘isi dedupe stats’ command provides cluster deduplication data usage and savings statistics, in both logical and physical terms. For example:
# isi dedupe stats
Cluster Physical Size: 86.14T
Cluster Used Size: 3.43T
Logical Size Deduplicated: 4.01T
Logical Saving: 3.65T
Estimated Size Deduplicated: 5.42T
Estimated Physical Saving: 4.93T
In-line dedupe and post-process SmartDedupe both deliver similar results, just at different stages of data ingestion. Since both features use the same core components, the results are combined. As such, the isi dedupe stats output reflects the sum of both inline dedupe and SmartDedupe efficiency. Similarly, the OneFS WebUI’s deduplication savings histogram combines the efficiency savings from both inline dedupe and SmartDedupe.
Note: The deduplication statistics do not include zero block removal savings. Because zero block removal is technically not due to data deduplication, it is tracked separately but included as part of the overall data reduction ratio.
OneFS 8.2.1 and later adds a ‘-O’ logical overlay flag to the ‘isi get’ CLI utility for viewing a file’s compression details.
For example:
# isi get -DDO file1
* Size: 167772160
* PhysicalBlocks: 10314
* LogicalSize: 167772160
PROTECTION GROUPS
lbn0: 6+2/2
2,11,589365248:8192[COMPRESSED]#6
0,0,0:8192[COMPRESSED]#10
2,4,691601408:8192[COMPRESSED]#6
0,0,0:8192[COMPRESSED]#10
Metatree logical blocks:
zero=32 shadow=0 ditto=0 prealloc=0 block=0 compressed=64000
The logical overlay information is described under the ‘protection groups’ output. This example shows a compressed file where the sixteen-block chunk is compressed down to six physical blocks (#6) and ten sparse blocks (#10). Under the ‘Metatree logical blocks’ section, a breakdown of the block types and their respective quantities in the file is displayed - including a count of compressed blocks.
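The chunk arithmetic in this description can be sketched by tallying the overlay entries from the example. The parser below is hypothetical and rests on two assumptions drawn only from the sample and the text above: the number after ‘#’ is a block count, and an entry whose three leading address fields are all zero represents sparse blocks. The meaning of the individual address fields is not interpreted.

```python
import re

# One sixteen-block chunk from the example 'isi get -DDO' output.
OVERLAY_LINES = [
    "2,11,589365248:8192[COMPRESSED]#6",
    "0,0,0:8192[COMPRESSED]#10",
]

# Three opaque address fields, a size field, a state tag, and a block count.
ENTRY = re.compile(r"^(\d+),(\d+),(\d+):(\d+)\[(\w+)\]#(\d+)$")


def summarize_chunk(lines):
    """Return (physical_blocks, sparse_blocks) for one chunk (hypothetical helper)."""
    physical = sparse = 0
    for line in lines:
        match = ENTRY.match(line.strip())
        if match is None:
            raise ValueError(f"unrecognized overlay entry: {line!r}")
        a, b, c, _size, _state, count = match.groups()
        if (a, b, c) == ("0", "0", "0"):
            sparse += int(count)    # assumption: all-zero address means sparse
        else:
            physical += int(count)
    return physical, sparse


physical, sparse = summarize_chunk(OVERLAY_LINES)
logical = physical + sparse
print(f"{logical} logical blocks stored in {physical} physical blocks")
print(f"chunk compression ratio: {logical / physical:.2f} : 1")
```

For the example chunk this gives sixteen logical blocks held in six physical blocks, which matches the description above.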
When compression has occurred, the ‘df’ CLI command will report a reduction in used disk space and an increase in available space. The ‘du’ CLI command will also report less disk space used. A file that for whatever reason cannot be compressed will be reported as such:
4,6,900382720:8192[INCOMPRESSIBLE]#1
OneFS 9.2 and later releases use inode version 8, which includes a couple of additional inode delta attributes for storing data reduction metrics. These new attributes are displayed by the ‘isi get -D’ CLI command, and report a file’s physical data blocks, compressed size, and protection blocks. For example:
In OneFS 8.2.1 and later, OneFS SmartQuotas has been enhanced to report the capacity saving from inline data reduction as a storage efficiency ratio. SmartQuotas reports efficiency as a ratio across the desired data set as specified in the quota path field. The efficiency ratio is for the full quota directory and its contents, including any overhead, and reflects the net efficiency of compression and deduplication. On a cluster with licensed and configured SmartQuotas, this efficiency ratio can be easily viewed from the WebUI by browsing to ‘File System > SmartQuotas > Quotas and Usage’. In OneFS 9.2 and later, in addition to the storage efficiency ratio, the data reduction ratio is also displayed.
Similarly, the same data can be accessed from the OneFS command line by the ‘isi quota quotas list’ CLI command. For example:
Example output from the ‘isi quota quotas list’ CLI command.
More detail, including both the physical (raw) and logical (effective) data capacities, is also available using the ‘isi quota quotas view <path> <type>’ CLI command. For example:
Example output from the ‘isi quota quotas view’ CLI command.
To configure SmartQuotas for inline data efficiency reporting, create a directory quota at the top-level file system directory of interest, for example /ifs. Creating and configuring a directory quota is a simple procedure and can be performed from the WebUI, as follows:
Browse to ‘File System > SmartQuotas > Quotas and Usage’ and select ‘Create a Quota’. In the create pane, set the Quota type to ‘Directory quota’, add the preferred top-level path to report on, select ‘File system logical size’ for Quota Accounting, and set the Quota Limits to ‘Track storage without specifying a storage limit’. Finally, select the ‘Create Quota’ button to confirm the configuration and activate the new directory quota.
The efficiency ratio is a single, point-in-time efficiency metric that is calculated per quota directory and includes the sum of inline compression, zero block removal, inline dedupe, and SmartDedupe. This is in contrast to a history of stats over time, as reported in the ‘isi statistics data-reduction’ CLI command output, described above. As such, the efficiency ratio for the entire quota directory reflects the data that is actually present when the ratio is read.
Note: The quota directory efficiency ratio and other statistics are not available through the platform API as of OneFS 9.0.
The ‘isi status’ CLI command output includes a ‘Data Reduction’ field:
In OneFS 8.2.1 and later, the OneFS WebUI cluster dashboard now displays a storage efficiency tile, which shows physical and logical space utilization histograms and reports the capacity saving from inline data reduction as a storage efficiency ratio. In OneFS 9.2 and later, a data reduction ratio is also included in the dashboard view. This cluster status view is displayed by default upon opening the OneFS WebUI in a browser and can be easily accessed by browsing to ‘File System > Dashboard > Cluster Overview’.
Note: All of the above storage efficiency tools are available on any cluster running OneFS 8.2.1 and later. However, the inline compression metrics are only relevant for clusters containing compression node pools.