Before adopting new technology in a production IT environment, you can help ensure success through proof-of-concept testing. When setting up our lab environment to perform functional testing based on the stretch clustering reference architecture, our primary goal was to see how Azure Stack HCI would handle VM and volume placement under certain failure conditions. A secondary objective was to observe the impact on a real running application (Dell OpenManage Enterprise) during failover scenarios.
The following figure shows the network architecture of our test lab environment, which used a high-throughput networking configuration. An existing 10 GbE link provided bandwidth for inter-site communication.
Figure 3. AX740SC101 cluster network architecture
Note these key considerations regarding the lab network architecture:
- The Storage Replica, management, and VM networks in each site were unique Layer 3 (L3) subnets. In Active Directory, we configured two sites—Bangalore (Site 1) and Chennai (Site 2)—based on these IP subnets so that the correct sites appeared in Failover Cluster Manager when we configured the stretched cluster. No additional manual configuration of the cluster fault domains was required.
- We configured static routes on all hosts in both sites to ensure that the Storage Replica networks could reach each other across sites through the intended QLogic adapters. The following table shows the networks that were needed to communicate across site boundaries.
Table 1. Required networks
Note: The top-of-rack (ToR) switches were not running BGP, so we also configured static routing on the ToRs to facilitate proper inter-site communication.
- Average latency between the two sites was less than 5 milliseconds, which is required for synchronous replication.
- Cluster nodes could reach a file share witness within the 200-millisecond maximum roundtrip latency requirement.
- The subnets in both sites could reach Active Directory, DNS, and DHCP servers.
- Software-defined networking (SDN) on a multisite cluster is not currently supported and was not used for this testing.
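The Active Directory site configuration described above can be sketched with the ActiveDirectory PowerShell module. The site names come from our lab; the subnet prefixes are placeholders, since the actual lab subnets are not listed in this document:

```powershell
# Create one AD site per physical location (run on a domain controller or a
# host with the ActiveDirectory RSAT module installed). Site names from the lab.
New-ADReplicationSite -Name "Bangalore"   # Site 1
New-ADReplicationSite -Name "Chennai"     # Site 2

# Map each site's L3 subnets to its AD site. The prefixes below are
# placeholders -- substitute the actual management, VM, and Storage Replica
# subnets used in each site.
New-ADReplicationSubnet -Name "192.168.100.0/24" -Site "Bangalore"
New-ADReplicationSubnet -Name "192.168.200.0/24" -Site "Chennai"
```

With the subnets mapped, each cluster node resolves to the correct AD site, and Failover Cluster Manager builds the site fault domains automatically, matching the behavior we observed in the lab.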
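The per-host static routing for the Storage Replica networks can be sketched as follows. All prefixes, interface aliases, gateway addresses, and target IPs are placeholders for illustration only:

```powershell
# On each host in Site 1, add a static route so that traffic to the Site 2
# Storage Replica subnet leaves through the intended QLogic adapter.
# Placeholder values -- substitute the lab's actual subnets and gateways.
New-NetRoute -DestinationPrefix "172.16.2.0/24" -InterfaceAlias "SLOT 3 Port 1" -NextHop "172.16.1.1"

# Spot-check that average round-trip latency to the remote site stays under
# the 5 ms limit for synchronous replication (target address is a placeholder).
Test-Connection -ComputerName "172.16.2.10" -Count 20 |
    Measure-Object -Property ResponseTime -Average
```

The same routes are added in the opposite direction on the Site 2 hosts, and equivalent static routes go on the ToR switches because they were not running BGP.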
The following figure illustrates the networking topology at the Hyper-V virtual networking layer:
Figure 4. Hyper-V virtual networking configuration
Note these key considerations regarding the virtual networking configuration:
- We configured the lab hosts with a single Switch Embedded Teaming (SET) team for management and VM traffic.
- We configured intra-site storage traffic and inter-site Storage Replica traffic to use dedicated physical QLogic adapters. We did not create a SET team for these connections.
- We configured RDMA on the intra-site storage connections but not on the inter-site Storage Replica connections. RDMA is not supported for replica traffic over L3 or WAN links.
- We used SMB Multichannel to distribute traffic evenly across both Storage Replica adapter ports. SMB Multichannel facilitates aggregation of network bandwidth and network fault tolerance when multiple paths are available.
- Inter-site Live Migration used the Storage Replica network, and intra-site Live Migration used the storage network.
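The SET team and the split RDMA configuration described above can be sketched with the Hyper-V and NetAdapter cmdlets. Switch and adapter names are placeholders, not the lab's actual names:

```powershell
# Create a single SET team carrying management and VM traffic
# (switch and physical adapter names are placeholders).
New-VMSwitch -Name "ConvergedSwitch" -NetAdapterName "NIC1","NIC2" `
    -EnableEmbeddedTeaming $true -AllowManagementOS $true

# Enable RDMA only on the intra-site storage adapters. Leave it disabled on
# the inter-site Storage Replica adapters, because RDMA is not supported for
# replica traffic over L3 or WAN links.
Enable-NetAdapterRdma -Name "Storage-A","Storage-B"
Disable-NetAdapterRdma -Name "Replica-A","Replica-B"
```

The Storage Replica adapters stay outside the SET team, as dedicated physical QLogic ports, which keeps replica traffic isolated from the converged management/VM path.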
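The SMB Multichannel behavior and the Live Migration network selection above can be verified and pinned with a few cmdlets; the subnet prefixes are placeholders:

```powershell
# Confirm SMB Multichannel is spreading Storage Replica traffic across both
# adapter ports (one connection entry per active path).
Get-SmbMultichannelConnection

# Restrict Live Migration to the intended networks (placeholder prefixes):
# the Storage Replica subnet for inter-site moves and the storage subnet
# for intra-site moves.
Add-VMMigrationNetwork "172.16.1.0/24"   # Storage Replica subnet (inter-site)
Add-VMMigrationNetwork "10.10.10.0/24"   # Storage subnet (intra-site)
```

Because the migration networks are ordered by subnet, Hyper-V picks the storage network for moves within a site and falls back to the Storage Replica network for cross-site moves, which matches the traffic pattern we observed.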