Elasticsearch also provides data compression for stored fields, the parts of an index that store each document's ID and its source data. Because Elasticsearch compresses data before it is written to PowerStore, PowerStore data reduction may yield little additional savings. We do not recommend enabling Elasticsearch data compression in a production environment without first testing it in a nonproduction Elasticsearch environment.
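For such testing, the stored-field codec is chosen per index at creation time through Elasticsearch's `index.codec` setting. The sketch below only builds the JSON body of an index-creation request; the index name `logs-test` is hypothetical, while `best_compression` is the real setting value:

```python
import json

# index.codec is a real Elasticsearch index setting; "best_compression"
# selects DEFLATE for stored fields, while "default" selects LZ4.
settings = {
    "settings": {
        "index": {
            "codec": "best_compression"
        }
    }
}

# Body for an index-creation request, e.g. PUT /logs-test
# (the index name "logs-test" is hypothetical).
body = json.dumps(settings, indent=2)
print(body)
```

Note that the codec applies only to newly written segments, so changing it on an existing index affects data written after the change.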
Starting in Elasticsearch 7.10, stored fields are split into blocks that are compressed independently. Compressing at the block level preserves fast random access to the data; compressing all the data at once would require decompressing everything at read time. Elasticsearch offers two compression options for stored fields: LZ4 (the default, favoring speed) and DEFLATE (higher compression, selected with the best_compression index codec).
Both options include an algorithm for string deduplication. The algorithm detects strings that have already occurred earlier in the stream; when a match is found, the string is replaced with a reference to the previous occurrence.
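As an illustration of this idea, here is a minimal pure-Python sketch of LZ77-style string deduplication. It is not the LZ4 code Elasticsearch actually uses; real implementations find matches with hash tables rather than the linear scan shown here:

```python
def dedup_compress(data: bytes, window: int = 1024, min_match: int = 4):
    """Replace repeated byte strings with (offset, length) references."""
    tokens = []  # each token: a 1-byte literal, or an (offset, length) reference
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Search the recent window for the longest earlier occurrence.
        for j in range(max(0, i - window), i):
            k = 0
            while i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_off = k, i - j
        if best_len >= min_match:
            tokens.append((best_off, best_len))  # reference to prior occurrence
            i += best_len
        else:
            tokens.append(data[i:i + 1])  # literal byte
            i += 1
    return tokens


def dedup_decompress(tokens) -> bytes:
    buf = bytearray()
    for t in tokens:
        if isinstance(t, tuple):
            off, length = t
            for _ in range(length):  # byte-by-byte copy handles overlaps
                buf.append(buf[-off])
        else:
            buf += t
    return bytes(buf)
```

Running `dedup_compress` on repetitive input yields mostly references, which is where the space savings come from.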
With DEFLATE, the deduplicated data is further compressed using Huffman coding.
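Because Python's standard zlib module implements DEFLATE, the combined effect of string deduplication and Huffman coding can be observed directly; the sample documents below are invented for illustration:

```python
import zlib

# Repetitive JSON-like documents, loosely resembling _source data.
docs = b'{"user":"alice","action":"login","status":"ok"}' * 100

# DEFLATE = LZ77-style deduplication followed by Huffman coding.
compressed = zlib.compress(docs, 9)
assert zlib.decompress(compressed) == docs
print(f"{len(docs)} bytes -> {len(compressed)} bytes")
```

Highly repetitive input like this compresses to a small fraction of its original size, which is also why pre-compressed data leaves little for PowerStore data reduction to reclaim.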
Elasticsearch 7.10 and later further improves the compression algorithm by using a dictionary of strings. When many strings in the data stream match entries in the dictionary, compression ratios can improve.
For more information about Elasticsearch data compression, see Save space and money with improved storage efficiency in Elasticsearch 7.10 | Elastic Blog.