Erasure coding is a capacity-efficient failure tolerance method and data protection option on all-flash VxRail configurations. As an alternative to the data replication provided by RAID-1 mirroring, erasure coding can yield up to 50 percent more usable capacity than conventional RAID-1 mirroring, which consumes storage with full copies of the data.
Erasure coding breaks data into fragments and distributes redundant chunks across the system, introducing redundancy through data blocks and striping. In basic terms, data blocks are grouped in sets of n, and for each set of n data blocks there is a set of p parity blocks. Together, these (n + p) blocks make up a stripe. The crux is that any n of the (n + p) blocks in a stripe are enough to recover all the data on the stripe.
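The single-parity case can be sketched in a few lines of Python. This is an illustration of the n + p idea with p = 1 and XOR parity, not the Reed-Solomon-style code a production system uses; the function names are invented for the example:

```python
# Minimal sketch of a 3+1 stripe with XOR parity (p = 1).
# Any n = 3 of the 4 blocks are enough to rebuild the missing one.

def xor_blocks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_stripe(data_blocks):
    """Return the n data blocks plus one XOR parity block (an n+1 stripe)."""
    parity = data_blocks[0]
    for block in data_blocks[1:]:
        parity = xor_blocks(parity, block)
    return list(data_blocks) + [parity]

def rebuild(stripe, lost_index):
    """Reconstruct the lost block by XOR-ing the surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    rebuilt = survivors[0]
    for block in survivors[1:]:
        rebuilt = xor_blocks(rebuilt, block)
    return rebuilt

data = [b"AAAA", b"BBBB", b"CCCC"]     # n = 3 data blocks
stripe = make_stripe(data)             # 3 + 1 stripe
assert rebuild(stripe, 1) == b"BBBB"   # lose one data block, rebuild it
```

Double parity (p = 2) requires a more elaborate code so that any two lost blocks can be recovered, but the principle is the same.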
In VxRail clusters, the data and parity blocks that belong to a single stripe are placed on different ESXi hosts in the cluster, providing a layer of failure tolerance for each stripe. Stripes do not follow a one-to-one distribution model: the set of n data blocks does not sit on one host with the parity set on another. Rather, the algorithm distributes the individual blocks of each stripe among the ESXi hosts.
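The placement invariant described above can be stated as a small sketch. This is a hypothetical illustration, not the actual vSAN placement algorithm; host names and function names are invented:

```python
# Sketch of the placement rule: every block of one stripe lands on a
# different host, so a single host failure removes at most one block.

def place_stripe(block_ids, hosts):
    """Assign each block of one stripe to a distinct host (needs hosts >= blocks)."""
    if len(hosts) < len(block_ids):
        raise ValueError("need at least one host per block in the stripe")
    return dict(zip(block_ids, hosts))

placement = place_stripe(["d0", "d1", "d2", "p0"],
                         ["esxi-01", "esxi-02", "esxi-03", "esxi-04"])
# All 3+1 blocks sit on different hosts.
assert len(set(placement.values())) == 4
```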
Erasure coding provides single-parity data protection (RAID-5), which tolerates one failure (FTT=1), and double-parity data protection (RAID-6), which tolerates two failures (FTT=2). A single-parity stripe uses three data blocks and one parity block (3+1) and requires a minimum of four hosts or four fault domains to remain available when one host or disk fails. It represents a 50 percent storage savings over RAID-1 mirroring. Dual parity uses four data blocks plus two parity blocks (4+2), requires a minimum of six nodes, and saves as much as 50 percent capacity over RAID-1 mirroring at the same failure tolerance (FTT=2). The diagrams below illustrate both implementations.
Figure 58. RAID-5 (FTT=1) requires a minimum of four nodes; RAID-6 (FTT=2) requires a minimum of six nodes
The figure below compares usable capacity. The erasure-coding protection method increases usable capacity by up to 50 percent compared to mirroring.
Figure 59. Erasure coding increases usable capacity up to 50 percent
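The capacity figures above follow from simple arithmetic on the stripe layouts. The sketch below works through the numbers for 100 GB of usable data; the function name is illustrative:

```python
# Raw capacity consumed for a given amount of usable data, per layout.
# Raw = usable * (data blocks + protection blocks) / data blocks.

def raw_needed(usable_gb, data_blocks, protection_blocks):
    return usable_gb * (data_blocks + protection_blocks) / data_blocks

usable = 100
raid1_ftt1 = raw_needed(usable, 1, 1)   # mirror, 2 copies: 200 GB raw
raid5 = raw_needed(usable, 3, 1)        # 3+1 stripe: ~133 GB raw
raid1_ftt2 = raw_needed(usable, 1, 2)   # mirror, 3 copies: 300 GB raw
raid6 = raw_needed(usable, 4, 2)        # 4+2 stripe: 150 GB raw

# RAID-6 consumes half the raw capacity of RAID-1 at FTT=2.
assert raid6 / raid1_ftt2 == 0.5
# Per raw TB, RAID-5 yields 75% usable versus 50% for RAID-1 (FTT=1),
# i.e., 50 percent more usable capacity.
assert abs(raid5 / raid1_ftt1 - 2 / 3) < 1e-9
```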
The SPBM policy Failure Tolerance Method (FTM) lets administrators choose between RAID-1 (Mirroring) and RAID-5/6 (Erasure Coding). The FTT policy determines the number of parity blocks written by the erasure code. See the figure below.
Figure 60. FTT policy determines the number of parity blocks written by the erasure code
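The FTT-to-layout relationship described above can be summarized as a lookup table. This is a hedged sketch of the mapping as the text describes it, not the vSAN SPBM API; the names here are invented:

```python
# Sketch: FTT setting -> erasure-coding layout, per the description above.
EC_LAYOUTS = {
    1: {"scheme": "RAID-5", "data": 3, "parity": 1, "min_hosts": 4},
    2: {"scheme": "RAID-6", "data": 4, "parity": 2, "min_hosts": 6},
}

def layout_for_ftt(ftt: int) -> dict:
    """Return the erasure-coding layout implied by the FTT policy value."""
    if ftt not in EC_LAYOUTS:
        raise ValueError("erasure coding supports FTT=1 or FTT=2 only")
    return EC_LAYOUTS[ftt]

assert layout_for_ftt(2)["parity"] == 2   # FTT=2 -> two parity blocks (RAID-6)
```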
VxRail implements erasure coding at a granular level; it can be applied per VMDK, allowing a nuanced approach. A VM with a write-intensive workload, such as a database log, can use a mirroring policy for that component while its data component uses an erasure-coding policy.
Erasure coding saves space but increases backend overhead. Computing parity blocks consumes CPU cycles and adds load to the network and disks, as does distributing data fragments across multiple hosts. This extra activity can affect latency and overall IOPS.
The rebuild operation also adds overhead. In general, rebuild operations multiply the number of disk reads and network transfers relative to replication. The cost follows a simple formula: if n is the number of data blocks in a stripe, rebuilding one block costs n times as much as re-copying a replica. For a 3+1 stripe, that means three disk reads and three network transfers in place of the single read and transfer of conventional data replication. The same reconstruction path is also invoked to serve read requests for data that is currently unavailable (degraded reads).
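The n-times cost rule can be made concrete with back-of-the-envelope arithmetic; the function name is illustrative:

```python
# Rebuild cost from the formula above: reconstructing one lost block of an
# n+p stripe reads the n surviving data-equivalent blocks and moves each one
# over the network, versus a single read/transfer to re-copy a mirror replica.

def rebuild_cost(n_data_blocks):
    """Disk reads and network transfers needed to reconstruct one lost block."""
    return {"disk_reads": n_data_blocks, "net_transfers": n_data_blocks}

assert rebuild_cost(3) == {"disk_reads": 3, "net_transfers": 3}   # 3+1 stripe
assert rebuild_cost(4) == {"disk_reads": 4, "net_transfers": 4}   # 4+2 stripe
```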
This additional I/O is the primary reason why only all-flash VxRail configurations support erasure coding; the rationale is that flash disks absorb the extra I/O with little latency impact.