Testing and validation
The solution went through a testing and validation cycle to verify the functionality of the deployment. The test cases encompassed general network tests, equipment failure scenarios, and a simulated availability zone failure. The following table lists each test scenario and its expected behavior.
Number | Passed | Scenario | Expected Behavior
------ | ------ | -------- | ------------------
1 | Yes | Multiple ESXi host failure(s) – Power off | VMware HA restarts the virtual machines on any of the surviving ESXi hosts within the VMware HA cluster.
2 | Yes | Multiple ESXi host failure(s) – Network disconnect | HA continues to exchange cluster heartbeats through the shared datastore. No virtual machine failovers occur.
3 | Yes | ESXi host experiences APD (All Paths Down) – encountered when the ESXi host loses access to its storage volumes (in this case, VPLEX volumes) | In an APD scenario, the ESXi host must be rebooted to recover. When the host is restarted, VMware HA restarts the failed virtual machines on the surviving ESXi hosts within the VMware HA cluster.
4 | Yes | VPLEX cluster failure (The VPLEX at either site-A or site-B has failed, but ESXi and other LAN/WAN/SAN components are intact.) | The I/O continues to be served on all the volumes on the surviving site. |
5 | Yes | VPLEX storage volume is unavailable (for example, it is accidentally removed from the storage view, or the ESXi initiators are accidentally removed from the storage view) | VPLEX continues to serve I/O on the other site where the volume is available.
6 | Yes | VPLEX inter-site link failure; vSphere cluster management network intact | VPLEX transitions Distributed Virtual Volumes on the non-preferred site to the I/O failure state. On the preferred site, the Distributed Virtual Volumes continue to provide access. |
7 | Yes | VPLEX Cluster Witness failure | VPLEX continues to serve I/O at both sites. |
8 | Yes | VPLEX inter-site WAN link failure and simultaneous Cluster Witness to site-B link failure | The VPLEX fails I/O on the Distributed Virtual Volumes at site-B and continues to serve I/O on site-A. |
9 | Yes | VPLEX inter-site WAN link failure and simultaneous Cluster Witness to site-A link failure | The VPLEX fails I/O on the Distributed Virtual Volumes at site-A and continues to serve I/O on site-B. |
10 | Yes | Multi AZ – Power down Edge hosts in AZ1 | Edge VMs move to AZ2 hosts and are reachable after HA powers them up.
11 | Yes | Multi AZ – Bring down ToR 9k switches in AZ1 | Traffic should come up on the ToR switches in AZ2.
12 | Yes | Multi AZ – Verify BGP peering in AZ2 after failover (example verification commands follow this table) | BGP was successful.
13 | Yes | Multi AZ – Test ECMP functionality in AZ2 | ECMP is functioning correctly.
14 | Yes | Functional test: Multi AZ – Verify East/West network connectivity after failover | Ping and traceroute were successful.
15 | Yes | Functional test: Multi AZ – Verify North/South network connectivity after failover | Ping and traceroute were successful.
16 | Yes | Multi AZ – Power up Edge hosts in AZ1 | Edge VMs should be moved by vSphere HA and powered up on AZ1 hosts.
17 | Yes | Multi AZ – Verify BGP peering in AZ1 after the Edge VMs are back online in AZ1 | BGP was successful.
18 | Yes | Multi AZ – BGP connectivity tests to ToR switches | Routing and BGP connectivity should be up.
19 | Yes | Using the CIMC, simultaneously power down all hosts in AZ2 | VMs running on the failed site are powered off by vSAN. Non-Stretch VMs will not be restarted. Stretch VMs running in AZ2 will be restarted in AZ1. |
20 | Yes | Complete site-B failure (The failure includes all ESXi hosts and the VPLEX cluster at site-B.) | VPLEX continues to serve I/O on the surviving site (site-A). When the VPLEX at site-B is restored, the Distributed Virtual Volumes are synchronized automatically from the active site (site-A). |
21 | Yes | Using the CIMC, simultaneously power down all hosts in AZ1 | VMs running on the failed site are powered off by vSAN. Non-Stretch VMs will not be restarted. Stretch VMs running in AZ1 will be restarted in AZ2. |
22 | Yes | Complete site-A failure (The failure includes all ESXi hosts and the VPLEX cluster at site-A.) | VPLEX continues to serve I/O on the surviving site (site-B). When the VPLEX at the failed site (site-A) is restored, the Distributed Virtual Volumes are synchronized automatically from the active site (site-B). |
23 | Yes | Using the CIMC, power down all hosts in AZ2 and verify the must-run affinity rules for AZ2 | Verify that the dur-m01-cl01-secondary-az-nonstretch-vms-must-run rule (srvs 1-4) is enforced and that no applicable VMs restart in AZ1.
24 | Yes | Using the CIMC, power down all hosts in AZ1 and verify the must-run affinity rules for AZ1 | Verify that the dur-m01-cl01-primary-az-nonstretch-vms-must-run rule (srvs 1-4) is enforced and that no applicable VMs restart in AZ2.
25 | Yes | On the vSAN Witness Appliance ToR switches A & B, configure an access list on each switch to block all IP traffic between the vSAN Witness Appliance vmk0 address and the AZ1 management host vmk0 IP addresses (an example ACL sketch follows this table). Upon test completion, remove the ACL. | Management VMs should continue to run from their host location.
26 | Yes | On the vSAN Witness Appliance ToR switches A & B, configure an access list on each switch to block all IP traffic between the vSAN Witness Appliance vmk0 address and the AZ2 management host vmk0 IP addresses. Upon test completion, remove the ACL. | Management VMs should continue to run from their host location.
27 | Yes | On the Witness Appliance, verify that vmk0 is set to MTU 9000 and that a vmkping test with jumbo frames and the DF bit set passes to the AZ1 and AZ2 hosts (see the example commands after this table). | A jumbo frame ping test with the DF bit set should pass between the vSAN Witness vmk0 and the vSAN vmk IP addresses of the AZ1 and AZ2 hosts.
28 | Yes | On the Witness Appliance, verify that vmk0 is set to MTU 1500 and that a vmkping test with jumbo frames and the DF bit cleared passes to the AZ1 and AZ2 hosts. | A jumbo frame ping test with the DF bit cleared should pass between the vSAN Witness vmk0 and the vSAN vmk IP addresses of the AZ1 and AZ2 hosts.
29 | Yes | In AZ1, place each host sequentially into maintenance mode using the vSAN data migration option "Ensure accessibility" and verify HA failover and restart for the VMs. Note: Only two hosts can be placed into maintenance mode. | Components on that host will be marked as absent. HA will restart any VMs running on that host.
30 | Yes | In AZ2, place a host into maintenance mode using the vSAN data migration option "Ensure accessibility" and verify HA failover and restart for the VMs. Note: Only two hosts can be placed into maintenance mode. | Components on that host will be marked as absent. HA will restart any VMs running on that host.
31 | Yes | In AZ1, identify the hosts for NSX Manager A and NSX Manager B. Sequentially place each host into maintenance mode. | vSphere HA will restart one NSX Manager on AZ1 hosts and the second NSX Manager on AZ2 hosts. This occurs due to the VM separation affinity rules.
32 | Yes | In AZ1, using the CIMC, power down a host and verify that HA restarts the VMs on another host in AZ1 | Components on that host will be marked as absent. HA will restart any VMs running on that host.
33 | Yes | In AZ2, using the CIMC, power down a host and verify that HA restarts the VMs on another host in AZ2 | Components on that host will be marked as absent. HA will restart any VMs running on that host.
34 | Yes | Select an EM VM that can run in both AZ1 and AZ2. If necessary, vMotion the VM to ensure that it is not running on the two hosts to be powered down. Using the CIMC, power down both hosts. Verify that the VM is still functioning. | Components on those hosts will be marked as absent. The VM should continue to run on the same host without an HA restart. If necessary, the VM should be able to access object components in AZ2.
35 | Yes | Select an EM VM that can only run in AZ2. If necessary, vMotion the VM to ensure that it is not running on the two hosts to be powered down. Using the CIMC, power down both hosts. Verify that the VM is still functioning. | Components on those hosts will be marked as absent. The VM should continue to run on the same host without an HA restart.
36 | Yes | General communication from AZ1 to AZ2 | Hosts in AZ1 should be able to ping hosts in AZ2. |
37 | Yes | Datastore failover between AZ1 and AZ2 | The datastore from the VPLEX in AZ1 should fail over to AZ2, and vice versa.
38 | Yes | Shut down the SVI interfaces of the 4th highest priority ToR switch | No impact
39 | Yes | Shut down the SVI interfaces of the 3rd highest priority ToR switch | No impact
40 | Yes | Shut down the SVI interfaces of the 2nd highest priority ToR switch | No impact
41 | Yes | Shut down the SVI interfaces of the highest priority ToR switch | No impact
42 | Yes | Reboot the AZ1 ToR B switch to simulate a switch down scenario | No impact |
43 | Yes | Reboot the AZ1 ToR A switch to simulate a switch down scenario | No impact |
44 | Yes | Reboot both the AZ1 ToR A & B switches to simulate a switch down scenario | No impact
45 | Yes | Reboot the AZ2 ToR B switch to simulate a switch down scenario | No impact |
46 | Yes | Reboot the AZ2 ToR A switch to simulate a switch down scenario | No impact |
47 | Yes | Reboot both the AZ2 ToR A & B switches to simulate a switch down scenario | No impact
48 | Yes | Reboot the AZ1 Mgmt B switch to simulate a switch down scenario | No impact |
49 | Yes | Reboot the AZ1 Mgmt A switch to simulate a switch down scenario | No impact |
50 | Yes | Reboot both the AZ1 Mgmt A & B switches to simulate a switch down scenario | No impact
51 | Yes | Test Management cluster anti-affinity rules and host enters maintenance mode | All NSX Manager appliances should reside on unique hosts, that is, no two NSX Manager appliances should reside on the same AMP host.
52 | Yes | Test Management cluster anti-affinity rules and single host failure | HA powers on the VM on a different host so that no two NSX Manager appliances reside on the same AMP host.
53 | Yes | Test Management cluster anti-affinity rules and double host failure | HA powers on the VMs on different hosts so that no two NSX Manager appliances reside on the same AMP host.
54 | Yes | Test Management Edge VM anti-affinity rules and host enters maintenance mode | All management Edge VMs should reside on unique hosts, that is, no two NSX Edge VMs should reside on the same AMP host.
55 | Yes | Test Management Edge VM anti-affinity rules and single host failure | HA powers on the VMs. All management Edge VMs should reside on unique hosts, that is, no two NSX Edge VMs should reside on the same AMP host.
56 | Yes | Test Management Edge VM anti-affinity rules and double host failure | HA powers on the VMs. All management Edge VMs should reside on unique hosts, that is, no two NSX Edge VMs should reside on the same AMP host.
57 | Yes | Test Management domain ECMP functionality | ECMP was using different N/S traffic paths.
58 | Yes | Failure testing – Management Edge BFD failover | BFD failed over within the expected parameters.
59 | Yes | Test Management domain BGP functionality | BGP was successful. |
60 | Yes | General zone communication through the Edge | Ping continued with one packet lost.
61 | Yes | Verify the NSX Manager Appliance cluster is fully operational | Cluster was stable. |
62 | Yes | Bring down the host with the active Tier-0 Edge node VM and observe failover | Network connectivity stayed up with no packets lost.
63 | Yes | Test Workload ECMP functionality | ECMP is functioning. |
64 | Yes | Bring down an active transport node and observe failover | Ping continued with one packet lost.
65 | Yes | Test Workload NSX Manager cluster anti-affinity rules and host enters maintenance mode | NSX Manager node migrated to a different AMP host using DRS. |
66 | Yes | Test Workload NSX Manager cluster anti-affinity rules and single host failure | NSX Manager node migrated to a different AMP host using DRS. |
67 | Yes | Test Workload NSX Manager cluster anti-affinity rules and host enters maintenance mode | NSX Manager node migrated to a different AMP host using DRS. |
68 | Yes | Test Workload NSX Manager cluster anti-affinity rules and single host failure | NSX Manager node migrated to a different AMP host using DRS. |
69 | Yes | Test Workload NSX Manager cluster anti-affinity rules and double host failure | The NSX Manager node was disconnected until the host was back online.
70 | Yes | Failure testing – Workload Edge BFD failover | BFD failed over within the expected parameters.
71 | Yes | Verify that ECMP is using multiple paths for N/S traffic flows | ECMP was using different N/S traffic paths.
72 | Yes | Test Workload Edge VM anti-affinity rules and host enters maintenance mode | NSX Edge VM moved to a different host. |
73 | Yes | Test Workload Edge VM anti-affinity rules and single host failure | NSX Edge VM moved to a different host.
74 | Yes | Test Workload Edge VM anti-affinity rules and double host failure | HA moved one NSX Edge VM to an AZ2 host.
75 | Yes | Functional testing Workload VM East/West same segment different host | Ping and traceroute were successful. |
76 | Yes | Functional testing Workload VM East/West different segment different host | Ping and traceroute were successful. |
77 | Yes | Functional testing Workload VM East/West same segment same host | Ping and traceroute were successful. |
78 | Yes | Functional testing Workload VM East/West different segment same host | Ping and traceroute were successful. |
79 | Yes | Functional testing Workload VM North/South Segment 1 | Ping and traceroute were successful. |
80 | Yes | Functional testing Workload VM North/South Segment 2 | Ping and traceroute were successful. |
81 | Yes | Test Workload domain BGP functionality | BGP was successful. |
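
Several of the tests above (for example, tests 12, 13, 59, and 71) verify BGP peering and ECMP behavior after a failover. On the Cisco Nexus 9000 ToR switches, these checks can be performed with standard NX-OS show commands; the workload prefix below is a hypothetical placeholder, not a value from the validated environment.

```
! Verify that the BGP sessions to the NSX Edge nodes are in the Established state
show ip bgp summary

! Multiple equal-cost next hops for a workload prefix indicate that ECMP is in use
show ip route 172.16.10.0/24
```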
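
Tests 25 and 26 isolate the vSAN Witness Appliance from the management hosts in one AZ by applying a temporary ACL on the witness ToR switches. The following NX-OS sketch illustrates the AZ1 variant; the ACL name, VLAN, and IP addresses are hypothetical placeholders, and the interface and direction to which the ACL is applied depend on the actual deployment.

```
! Block traffic in both directions between the witness vmk0 (placeholder 192.168.30.10)
! and the AZ1 management host vmk0 subnet (placeholder 192.168.10.0/24)
ip access-list BLOCK-WITNESS-AZ1
  10 deny ip host 192.168.30.10 192.168.10.0/24
  20 deny ip 192.168.10.0/24 host 192.168.30.10
  30 permit ip any any

interface Vlan300
  ip access-group BLOCK-WITNESS-AZ1 in
```

After the test, removing the ACL from the interface and deleting the access list restores normal witness connectivity, as the test description requires.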
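
Tests 27 and 28 can be validated from the ESXi shell of the vSAN Witness Appliance with vmkping. The commands below are a minimal sketch: the destination addresses are hypothetical placeholders for the vSAN vmk IP addresses of the AZ1 and AZ2 hosts, and the 8972-byte payload allows for the 28 bytes of ICMP and IP headers within a 9000-byte frame.

```
# Test 27: vmk0 set to MTU 9000, DF bit set (-d), jumbo-sized payload (-s 8972)
vmkping -I vmk0 -d -s 8972 192.168.10.11
vmkping -I vmk0 -d -s 8972 192.168.20.11

# Test 28: vmk0 set to MTU 1500, DF bit not set, so the jumbo-sized payload is
# fragmented in transit and the ping still succeeds
vmkping -I vmk0 -s 8972 192.168.10.11
vmkping -I vmk0 -s 8972 192.168.20.11
```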