CKD compression
As with open systems data, mainframe (CKD) data is compressed when writes are de-staged from cache to back-end storage. Compression reduces incoming writes to the smallest possible size so that they consume as little capacity as possible. Data is compressed as it passes through data reduction hardware built into the system, which uses the GZIP compression algorithm. In a single pass through the hardware, the data is divided into four sections that are compressed in parallel to maximize hardware efficiency. The sum of the four compressed section sizes is the final reduced size of the data stored on disk. Because each section can be handled independently, this layout provides granular access to reduced data: a partial read or write request processes only the sections that contain the requested data.
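The four-section scheme can be illustrated in software. The following is only a sketch: the actual compression runs in dedicated hardware, and the equal-size split, the thread pool, and the function names `split_sections`, `compress_sections`, `reduced_size`, and `read_section` are assumptions made for illustration, not part of the product.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor

SECTIONS = 4  # the text states that data is divided into four sections


def split_sections(data: bytes, n: int = SECTIONS) -> list[bytes]:
    """Split data into n contiguous sections (equal-size split is an assumption)."""
    size = -(-len(data) // n)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(n)]


def compress_sections(data: bytes) -> list[bytes]:
    """Compress each section independently and in parallel with gzip."""
    with ThreadPoolExecutor(max_workers=SECTIONS) as pool:
        return list(pool.map(gzip.compress, split_sections(data)))


def reduced_size(compressed: list[bytes]) -> int:
    """Final reduced size: the sum of the compressed section sizes."""
    return sum(len(section) for section in compressed)


def read_section(compressed: list[bytes], index: int) -> bytes:
    """Partial read: decompress only the section that holds the requested data."""
    return gzip.decompress(compressed[index])
```

Because every section is a self-contained gzip stream, `read_section` never touches the other three sections, which mirrors the granular access described above.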
Compressing data generally does not result in performance loss; however, there can be a negligible performance effect when data must be decompressed. Decompression occurs when there are read or write update requests to data stored on the back end in compressed form. A large majority of mainframe workloads exhibit a high read hit percentage, so back-end disk access is dramatically reduced, which further reduces the impact of decompression on the overall workload. For mainframe workloads, decompression overhead is not a significant factor in overall IOPS performance. In addition, activity-based reduction (described in the section Activity-based reduction) avoids any decompression performance impact for the back-end data that is most frequently accessed.
Activity-based reduction offsets the negligible performance cost of decompressing frequently accessed data. This function allows up to 20% of the busiest data to be stored on the system uncompressed, which eliminates the performance impact of repeatedly decompressing that data. To determine how busy data is, the system uses machine learning (ML) algorithms that process statistics collected from incoming I/O to the front-end devices. This approach balances system resources, providing an environment suited to both data reduction savings and performance.
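The 20% cap can be sketched as a simple selection over per-extent I/O statistics. This is a deliberately simplified stand-in: the system uses ML algorithms whose details are not described here, whereas the sketch below just ranks by I/O count. The function name `select_uncompressed`, the notion of equal-size extents, and the `budget_percent` parameter are hypothetical.

```python
def select_uncompressed(io_counts: dict[str, int], budget_percent: int = 20) -> set[str]:
    """Pick the busiest extents to leave uncompressed, capped at budget_percent.

    io_counts maps a (hypothetical) extent name to its observed I/O count.
    Extents are assumed to be equal in size, so a share of extents is also
    a share of capacity.
    """
    limit = len(io_counts) * budget_percent // 100  # round down: never exceed the cap
    # Sort by descending I/O count; break ties by name so the result is deterministic.
    ranked = sorted(io_counts, key=lambda extent: (-io_counts[extent], extent))
    return set(ranked[:limit])
```

With ten extents and the default budget, only the two busiest extents are returned, and everything else remains eligible for compression.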
Data is placed on disk by a process called compaction, which puts reduced or unreduced data in the best available location. Data is stored on disk in write objects. Each write object is 6 MB of contiguous back-end data device capacity across the drives configured in the system. Write objects are aligned on 1 KB boundaries and are consumed sequentially, in single use. To optimize writes, write objects are spread across full stripes for all supported RAID types. Each object supports reduced or unreduced data. An unreduced write object consists of 108 CKD tracks. A reduced write object consists of 1000 reduced tracks. Reduced entries in write objects range from 1 KB to 52 KB.
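The write-object figures above can be turned into a small packing sketch. It rests on several assumptions that the text does not confirm: that 6 MB means 6 x 1024 KB, that each reduced entry is rounded up to a 1 KB boundary, that 1000 is an upper bound on reduced tracks per object, and that a new object is opened as soon as the current one cannot take the next entry. The names `WriteObject`, `aligned_size`, and `pack` are hypothetical.

```python
from dataclasses import dataclass, field

KB = 1024
WRITE_OBJECT_BYTES = 6 * 1024 * KB    # 6 MB per write object (binary units assumed)
ALIGNMENT = 1 * KB                    # assumed 1 KB granularity for reduced entries
MAX_REDUCED_TRACKS = 1000             # treated here as a per-object upper bound
MIN_ENTRY, MAX_ENTRY = 1 * KB, 52 * KB  # reduced entry range given in the text


def aligned_size(nbytes: int) -> int:
    """Round a reduced track size up to the next 1 KB boundary."""
    return -(-nbytes // ALIGNMENT) * ALIGNMENT


@dataclass
class WriteObject:
    used: int = 0
    entries: list[int] = field(default_factory=list)

    def fits(self, size: int) -> bool:
        """True if the entry fits by both capacity and track count."""
        return (self.used + size <= WRITE_OBJECT_BYTES
                and len(self.entries) < MAX_REDUCED_TRACKS)


def pack(reduced_sizes: list[int]) -> list[WriteObject]:
    """Fill write objects sequentially; a closed object is never revisited."""
    objects: list[WriteObject] = []
    for nbytes in reduced_sizes:
        size = aligned_size(nbytes)
        if not MIN_ENTRY <= size <= MAX_ENTRY:
            raise ValueError(f"entry of {nbytes} bytes is outside the 1 KB to 52 KB range")
        if not objects or not objects[-1].fits(size):
            objects.append(WriteObject())  # open the next object in sequence
        objects[-1].entries.append(size)
        objects[-1].used += size
    return objects
```

Under these assumptions, 1000 entries of 6 KB each fit in a single object, while entries at the 52 KB maximum fill an object after 118 tracks, so capacity rather than track count becomes the limit for poorly reducible data.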