This planning guide provides best practices and requirements for using stretched clusters with a VxRail appliance. This guide assumes that the reader is familiar with the vSAN Stretched Cluster Guide. This guide is for use with a VxRail appliance only.
The vSAN stretched cluster feature creates a cluster that spans two geographically separate sites, with I/O written synchronously between the sites. This allows the cluster to tolerate the failure of an entire site. It extends the concept of fault domains to data center awareness domains.
The following is a list of the terms that are used for vSAN stretched clusters:
- Preferred or primary site – one of the two data sites that is configured as a vSAN fault domain.
- Secondary site – one of the two data sites that is configured as a vSAN fault domain.
- Witness host – a dedicated ESXi host or vSAN witness appliance can be used as the witness host. The witness components are stored on the witness host and provide a quorum to prevent a split-brain scenario if the network is lost between the data sites. This is the third fault domain.
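The quorum role of the witness can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which each of the three fault domains counts as one vote and a partition stays available only with a strict majority. The names `FAULT_DOMAINS` and `has_quorum` are hypothetical and are not part of any vSAN or VxRail API; real vSAN assigns votes per object component.

```python
# Simplified model (assumption): one vote per fault domain.
FAULT_DOMAINS = ("preferred", "secondary", "witness")


def has_quorum(partition):
    """Return True if the fault domains that can still reach each other
    form a strict majority of all fault domains."""
    members = set(partition) & set(FAULT_DOMAINS)
    return len(members) * 2 > len(FAULT_DOMAINS)


# Inter-site link lost; the witness still reaches the preferred site.
print(has_quorum({"preferred", "witness"}))  # True
# The isolated secondary site cannot form a majority on its own.
print(has_quorum({"secondary"}))  # False
```

In this model only one side of a network partition can ever hold a majority, which is how the third fault domain prevents a split-brain scenario.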
The vSAN storage policies that impact stretched clusters are:
- Dual site mirroring (stretched cluster) – enables protection across sites.
- None – keeps data on preferred (stretched cluster) – keeps data on the primary site only, with no cross-site protection.
- None – keeps data on non-preferred (stretched cluster) – keeps data on the secondary site only, with no cross-site protection.
- Primary Failures to Tolerate (PFTT or FTT) – defines how many disk or node failures can be tolerated for each site. For a stretched cluster, the required number of nodes is ‘2n+1’, where n is the number of failures to tolerate. For erasure coding, the required number of nodes is 4 (RAID 5, one failure) or 6 (RAID 6, two failures). Erasure coding (RAID 5 or 6) requires an All-Flash configuration.
- Secondary Failures to Tolerate (SFTT) – (stretched cluster) – defines the number of node and disk failures that an object can tolerate within a site.
- Failure Tolerance Method (FTM) – defines the RAID method: RAID 1 (mirroring) or RAID 5 or 6 (erasure coding).
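The node-count arithmetic in the failures-to-tolerate entry above can be sketched as follows. This is an illustrative helper only: the function name `min_nodes` and the method labels `"mirroring"` and `"erasure_coding"` are hypothetical, and the result is read as a per-site minimum on the assumption that the policy is applied within each data site.

```python
def min_nodes(failures_to_tolerate, method="mirroring"):
    """Return the minimum node count for a given failures-to-tolerate value.

    mirroring (RAID 1):        2n + 1 nodes
    erasure_coding (RAID 5/6): 4 nodes for 1 failure, 6 nodes for 2 failures
    """
    n = failures_to_tolerate
    if method == "mirroring":
        if n < 0:
            raise ValueError("failures_to_tolerate must be non-negative")
        return 2 * n + 1
    if method == "erasure_coding":
        required = {1: 4, 2: 6}  # RAID 5 and RAID 6 respectively
        if n not in required:
            raise ValueError("erasure coding supports 1 or 2 failures only")
        return required[n]
    raise ValueError(f"unknown method: {method!r}")


print(min_nodes(1))                    # 3
print(min_nodes(2))                    # 5
print(min_nodes(1, "erasure_coding"))  # 4
print(min_nodes(2, "erasure_coding"))  # 6
```

For the same number of tolerated failures, erasure coding needs more nodes than mirroring, so the available node count can limit which Failure Tolerance Method is usable.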