The ability to dynamically multipath and load balance through the use of native or third-party storage vendor multipathing software is available in VMware vSphere. The Pluggable Storage Architecture (PSA) is a modular storage construct that allows storage partners (such as Dell with PowerPath/VE (Virtual Edition)) to write a plug-in that best leverages the unique capabilities of their storage arrays. These modules interface with the storage array to continuously determine the best path selection, and they make use of redundant paths to greatly increase the performance and reliability of IO from the ESXi host to storage.
While PowerPath/VE is Dell’s best practice with VMware due to its superior ability to manage traffic and detect anomalies, Dell understands that many companies will choose to use NMP, so both it and PowerPath/VE are discussed in the following sections.
By default, the native multipathing plug-in (NMP) supplied by VMware is used to manage IO for non-NVMeoF devices. NMP can be configured to support the fixed and round robin (RR) path selection policies (PSP).[1] In addition, Dell supports the use of ALUA (Asymmetric Logical Unit Access), though only with the Mobility ID.
NMP is not supported for NVMeoF. Instead, VMware uses a different plug-in called the High-Performance Plug-in, or HPP. This plug-in has been developed specifically for NVMe devices, though it is only the default for NVMeoF devices. For local NVMe devices, NMP is the default, though it can be changed to HPP through claim rules. HPP only supports ALUA with NVMeoF devices, but unlike NMP, it is unnecessary to create a different claim rule for these devices as HPP is designed for ALUA. To support multipathing, HPP uses Path Selection Schemes (PSS), rather than the PSPs of NMP, when selecting physical paths for IO requests. HPP supports the following PSS mechanisms:
- FIXED: IO always uses a designated preferred path.
- LB-RR (Load Balance - Round Robin): the default; after transferring a set number of IOs or bytes on the current path, IO rotates to the next available path.
- LB-IOPS (Load Balance - IOPs): after transferring a set number of IOs, IO is sent to the path with the fewest outstanding IOs.
- LB-BYTES (Load Balance - Bytes): after transferring a set number of bytes, IO is sent to the path with the fewest outstanding bytes.
- LB-Latency (Load Balance - Latency): IO is sent to the optimal path, selected by evaluating the average latency and the number of outstanding IOs on each path.
NMP and other multipathing plug-ins are designed to coexist on an ESXi host; nevertheless, multiple plug-ins cannot manage the same device simultaneously. To address this, VMware created the concept of claim rules. Claim rules are used to assign storage devices to the proper multipathing plug-in (MPP). When an ESXi host boots or performs a rescan, it discovers all physical paths to the storage devices visible to the host. Using the claim rules, the ESXi host determines which multipathing module will be responsible for managing a specific storage device. Claim rules are numbered, and for each physical path, the ESXi host processes them starting with the lowest number first. The attributes of the physical path are compared with the path specification in the claim rule. If there is a match, the ESXi host assigns the MPP specified in the claim rule to manage the physical path.
This assignment process continues until all physical paths are claimed by an MPP. Figure 44 shows a sample claim rules list with only NMP installed.
Figure 44 Default claim rules with the native multipathing module
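To view the claim rules on a host, similar to those shown in Figure 44, the following command can be issued:
esxcli storage core claimrule list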
VMware uses the concept of Storage Array Type Plug-ins (SATP), which are submodules of NMP. While VMware offers several generic SATPs, each vendor currently has its own SATP in the ESXi code. The SATP for PowerMax arrays is named VMW_SATP_SYMM. When VMware recognizes that a device is from one of these arrays, it will be assigned this SATP. Note that each Dell array has its own SATP. Note, however, that VMware is moving away from this model and is asking vendors to use the generic SATPs.
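The SATPs available on a host, and the default PSP associated with each, can be listed as follows:
esxcli storage nmp satp list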
An administrator can choose to modify the claim rules, for instance, in order to have NMP/HPP manage Dell or non-Dell devices. It is important to note that after the initial installation of an MPP, claim rules do not go into effect until after the vSphere host is rebooted; it is therefore a best practice to make claim rule changes after installation but before the immediate post-installation reboot. If changes to the claim rules are needed after installation, devices can be manually unclaimed and the rules reloaded without a reboot as long as IO is not running on the device (for instance, the device can contain a mounted VMFS volume, but it cannot contain running virtual machines). For instructions on changing claim rules, consult the PowerPath/VE for VMware vSphere Installation and Administration Guide at https://www.dell.com/support/home/en-us or the VMware vSphere SAN Configuration Guide available from www.VMware.com.
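As a sketch of the unclaim-and-reload sequence described above, assuming a quiesced device whose identifier here is a placeholder, the following could be issued:
esxcli storage core claiming unclaim -t device -d <device_id>
esxcli storage core claimrule load
esxcli storage core claimrule run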
As mentioned, the default SATP for PowerMax is VMW_SATP_SYMM. VMware will assign this SATP to any PowerMax device[2] unless different claim rules are added. This can be problematic when utilizing the Mobility ID (MID) for devices.[3] While the default SATP can be used with the MID, it can cause problems in two particular configurations: RDMs for a Red Hat GuestOS, and SRDF/Metro. In these configurations it is essential to use ALUA; in fact, SRDF/Metro does not support using the VMW_SATP_SYMM SATP with the MID. To assign an ALUA SATP to MID devices, a new claim rule is necessary for VMware’s generic ALUA SATP, VMW_SATP_ALUA. To add the rule, issue the following on each ESXi host that will have Mobility ID devices. Notice the use of the Round Robin PSP (with the best practice iops=1 as discussed below), not MRU. The description (-e) can be adjusted:
esxcli storage nmp satp rule add -V EMC -M SYMMETRIX -s VMW_SATP_ALUA -c tpgs_on -P VMW_PSP_RR -O iops=1 -e "VMAX ALUA rule for Mobility IDs"
Reboot the ESXi host to complete the configuration. Any existing and new MID devices will be assigned the ALUA SATP instead of the default.
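After the reboot, the SATP assignment of a Mobility ID device can be verified by querying it (the device identifier is a placeholder):
esxcli storage nmp device list -d <device_id>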
Note: PowerPath/VE automatically detects the Mobility ID and assigns those devices to its ALUA policy. Therefore, when using this path management software, no changes are required. Only default NMP requires manual intervention on each ESXi host.
The High-Performance Plug-in uses Path Selection Schemes (PSS) to manage multipathing just as NMP uses PSPs. As noted above, HPP offers the FIXED, LB-RR, LB-IOPS, LB-BYTES, and LB-Latency PSS options.
Because the LB-IOPS, LB-BYTES, and LB-Latency PSSs offer intelligence, they are superior to LB-RR or FIXED. As performance is paramount for NVMeoF, Dell recommends using the Load Balance - Latency (LB-Latency) PSS, as it offers the best chance at uniform performance across the paths.
To set the PSS on an individual device, issue the following as seen in Figure 45:
esxcli storage hpp device set -P LB-Latency -d eui.04505330303033380000976000019760
Figure 45 Setting PSS LB-Latency on FC-NVMe device
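The PSS assignment can be confirmed by listing the device through HPP (the device identifier is a placeholder):
esxcli storage hpp device list -d <device_id>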
To add a claim rule so that this PSS is used for every NVMeoF device (FC or TCP) on a PowerMax 8000 at reboot, issue the following:
esxcli storage core claimrule add -r 914 -t vendor --nvme-controller-model='EMC PowerMax_8000' -P HPP --config-string "pss=LB-Latency"
Note that one cannot pass the usual “model” flag because that field is restricted to 16 characters and the Dell PowerMax model strings are 17 characters (e.g., EMC PowerMax_8000). For cases like these, VMware offers the --nvme-controller-model flag, which is used here. Be sure to adjust the model to match the PowerMax array.
Although a reboot is required for current devices, if the new claimrule is loaded by issuing the following, any future devices masked to the host will have the new setting:
esxcli storage core claimrule load
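After loading, the new rule (914 in this example) should appear twice in the claim rule list, once with the file class and once with the runtime class:
esxcli storage core claimrule list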
By default, every IO that passes through ESXi goes through the IO scheduler. Because of the speed of NVMe, it is possible that using the scheduler creates internal queuing, thus slowing down the IO. VMware therefore offers the ability to set a latency threshold so that any IO with a response time below the threshold bypasses the scheduler. When this mechanism is enabled and the IO is below the threshold, the IO passes directly from PSA through the HPP to the device driver.
For the mechanism to work, the observed average IO latency must be lower than the set latency threshold. If the IO latency exceeds the latency threshold, the IO temporarily returns to the IO scheduler. The bypass is resumed when the average IO latency drops below the latency threshold again.
There are a couple of different ways to set the latency threshold. To list the existing thresholds, issue:
esxcli storage core device latencythreshold list
To set the latency threshold at the device level, issue:
esxcli storage core device latencythreshold set -d eui.36fe0068000009f1000097600bc724c2 -t 10
To set it for all Dell NVMeoF devices on a PowerMax 8000 (adjust the model based on the array), issue:
esxcli storage core device latencythreshold set -v 'NVMe' -m 'EMC PowerMax_8000' -t 10
These settings will persist across reboots, but any new devices will require latencythreshold to be set on them. Dell makes no specific recommendation around latencythreshold, as VMware does not. There has not been any scale testing to date that provides data on the value of this parameter; however, Dell supports its use if desired.
Claim rules and claiming operations must all be done through the CLI, but the NMP multipathing policy can be chosen in the vSphere Client itself. By default, PowerMax devices managed by NMP and the SATP VMW_SATP_SYMM are set to the “Round Robin” policy, which is a best practice. Round Robin uses automatic path selection, rotating through all available paths and thereby distributing the load across those paths.
Dell recommends adjusting the frequency of path changes, which is covered next, but the Round Robin policy itself should not be changed. As there are customer use cases that might necessitate changing the policy, however, the procedure is covered here for completeness. There are two methods to change the policy: the CLI on the ESXi host, or a GUI (the vSphere Client or Dell VSI).
For the CLI method, issue the following command on each ESXi host:
esxcli storage nmp satp rule add -s VMW_SATP_SYMM -P VMW_PSP_RR -M SYMMETRIX
From then forward, after a reboot, all PowerMax devices will be set to Round Robin by default. Alternatively, each device can be manually changed in the vSphere Client as in Figure 46.
Figure 46 NMP policy selection in vSphere Client
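A single claimed device can also have its policy changed from the CLI rather than the vSphere Client; a sketch, with a placeholder device identifier:
esxcli storage nmp device set -d <device_id> -P VMW_PSP_RR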
Although a reboot is required for current devices, if the new claimrule is loaded by issuing the following, any future devices masked to the host will have the new setting:
esxcli storage core claimrule load
Claim rules and claiming operations must all be done through the CLI, but the HPP multipathing policy for NVMeoF devices can be chosen in the vSphere Client itself. By default, PowerMax NVMeoF devices managed by HPP have the PSS set to “LB-RR”. As this is not a best practice, the PSS can be changed to LB-Latency either through the CLI or the vSphere Client. Through the CLI, for each device execute:
esxcli storage hpp device set -P LB-Latency -d <device_id>
Alternatively, each device can be manually changed in the vSphere Client as in Figure 47.
Figure 47 HPP policy selection in vSphere Client
Dell VSI offers the ability to set the best practices for all NVMeoF devices on a host or cluster under the ESXi Host Settings menu. It is possible to update existing devices as well as add a rule for future ones. The interface for doing this is shown in Figure 48 for the vSphere Client.
Figure 48 VSI HPP Host Recommended Settings - vSphere Client
The NMP Round Robin path selection policy has a parameter known as the “IO operation limit,” which controls the number of IOs sent down each path before switching to the next path. The default value is 1000, so NMP defaults to switching from one path to another after sending 1000 IOs down any given path. Tuning the Round Robin IO operation limit parameter can significantly improve the performance of certain workloads, markedly so with sequential workloads. In environments with random and OLTP-type workloads, setting the parameter to lower values still yields the best throughput.
Dell recommends that the NMP Round Robin IO operation limit parameter be set to 1 for all PowerMax devices.[4] This ensures the best possible performance regardless of the workload being generated from the vSphere environment.
It is important to note that this setting is per device. In addition, setting it on one host will not propagate the change to other hosts in the cluster; therefore, it must be set for every claimed PowerMax device on every host. For all new, unclaimed devices, however, a global value can be set by adding a rule to ensure the operation limit is applied on claiming. This also holds true if a claimed device is unclaimed and reclaimed. To add the rule:
esxcli storage nmp satp rule add -s "VMW_SATP_SYMM" -V "EMC" -M "SYMMETRIX" -P "VMW_PSP_RR" -O "iops=1"
To view the current configuration and then set the limit on a claimed device:
esxcli storage nmp psp roundrobin deviceconfig get --device=<device NAA>
esxcli storage nmp psp roundrobin deviceconfig set --device=<device NAA> --iops=1 --type iops
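To apply the limit to every claimed PowerMax device on a host in one pass, a small loop can be run in the ESXi shell. This is a sketch that assumes PowerMax device identifiers begin with naa.60000970; verify the prefix in the environment before use:
for dev in $(esxcfg-scsidevs -c | awk '{print $1}' | grep naa.60000970); do
   esxcli storage nmp psp roundrobin deviceconfig set --device=$dev --iops=1 --type=iops
done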
Note that this parameter cannot be altered using the vSphere Client, as this is a CLI-only operation. A reboot is still required to apply the rule’s limit to already-claimed devices, but to ensure future devices receive the new limit before a reboot, issue:
esxcli storage core claimrule load
Dell VSI offers the ability to set the IO Operation limit to 1 for all devices on a host or cluster under the ESXi Host Settings menu. It is also possible to add the rule previously mentioned. The interface for doing this is shown in Figure 49 for the vSphere Client.
Figure 49 VSI NMP Host Recommended Settings - vSphere Client
Beginning with vSphere 6.7 U1, VMware offers a new type of Round Robin NMP policy known as Latency Round Robin (“latency”). The capability enables VMware to test the performance of the paths to a device and route IO appropriately. While Dell still recommends using Round Robin with the IO operation limit set to 1, it is acceptable to use the latency type instead if desired, and it is recommended for uniform SRDF/Metro environments. For more detail, including implementation steps, refer to the white paper Best Practices for Using Dell SRDF/Metro in a VMware vSphere Metro Storage Cluster.
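To switch a claimed device to the latency mechanism, the same deviceconfig command is used with a different type; a sketch with a placeholder device identifier:
esxcli storage nmp psp roundrobin deviceconfig set --device=<device_id> --type=latency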
Dell PowerPath/VE delivers PowerPath multipathing features to optimize VMware vSphere environments. PowerPath/VE uses a command set, called rpowermt, to monitor, manage, and configure PowerPath/VE for vSphere.
The syntax, arguments, and options are very similar to the traditional powermt commands used on all other operating system platforms supported by PowerPath multipathing. There is one significant difference: rpowermt is a remote management tool.
ESXi 6 and higher do not have a service console.[5] In order to manage an ESXi host, customers have the option to use vCenter Server or CLI on a remote server. PowerPath/VE for vSphere uses the rpowermt command line utility for ESXi. The tools that contain the CLI can be installed on a Windows or Linux server or run directly on the PowerPath/VE virtual appliance.
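For example, to display the state of all PowerPath-managed devices on a host from the remote server, a command of the following form is used (the host name is a placeholder):
rpowermt display dev=all host=<esxi_hostname>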
PowerPath/VE for vSphere should not be managed on the ESXi host itself. There is also no option for choosing PowerPath within the “Edit Multipathing Policies” dialog within the vSphere Client as seen in Figure 46. Similarly, if a device is currently managed by PowerPath, no policy options are available for selection as seen in Figure 50. This holds true for any protocol. VMware will default to MRU, but if an attempt is made to change the policy here, nothing will occur.
Figure 50 Manage Paths dialog viewing a device under PowerPath ownership
[1] Dell does not support using the MRU PSP for NMP with PowerMax.