PowerScale OneFS 8.2 introduces support for more than one SSIP per subnet. In previous releases, only a single SSIP per subnet was supported and resided on the lowest available NodeID, as explained in Where the SmartConnect Service IP runs (pre OneFS 8.2).
The dependence on a single SSIP caused problems during node maintenance, reboots, or interface flaps. These problems were magnified because the lowest available NodeID is usually the first node to be rebooted or scheduled for maintenance.
The addition of more than a single SSIP provides fault tolerance and a failover mechanism, ensuring the SmartConnect service continues to load balance clients according to the selected policy. In previous releases of OneFS, once the node hosting the SSIP was out of service, or if the interface was flapping, client connections would fail momentarily as the SSIP migrated to a different node.
The number of SSIPs available per subnet depends on the SmartConnect license. SmartConnect Basic allows two SSIPs per subnet while SmartConnect Advanced allows six SSIPs per subnet, as displayed in the following table:
SSIPs per subnet
SmartConnect Multi-SSIP is not an additional layer of load balancing for client connections. Additional SSIPs provide redundancy and reduce failure points in the client connection sequence. Returning to the original figure explaining the SmartConnect connection sequence, additional connections are added at step 2, as illustrated in the following figure:
At step 2, the site DNS server sends a DNS request to the SSIP and waits for the response in step 3, which carries a node’s IP address selected by the client connection policy. If the response in step 3 is not received within the timeout window for any reason, the request times out, and the DNS server tries the second SSIP and again awaits the response in step 3. If each request times out, the DNS server continues cycling through the subsequent SSIPs, up to the sixth SSIP with SmartConnect Advanced.
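The failover sequence described above can be sketched in Python. This is a minimal illustration, not how any particular DNS server is implemented: `resolve_via_ssips` and `demo_query` are hypothetical names, and the node IP address returned is made up for the example.

```python
from typing import Callable, Optional, Sequence


def resolve_via_ssips(
    ssips: Sequence[str],
    query: Callable[[str], str],
) -> Optional[str]:
    """Try each SSIP in order; return the first node IP received, or None."""
    for ssip in ssips:
        try:
            # Step 2 and step 3: send the request and wait for the reply.
            return query(ssip)
        except TimeoutError:
            # No reply inside the timeout window: move on to the next SSIP.
            continue
    return None


def demo_query(ssip: str) -> str:
    """Hypothetical stand-in for a DNS query: the first SSIP never answers."""
    if ssip == "192.168.25.10":
        raise TimeoutError(f"no response from {ssip}")
    return "192.168.25.101"  # made-up node IP chosen by the SmartConnect policy


answer = resolve_via_ssips(["192.168.25.10", "192.168.25.11"], demo_query)
print(answer)  # 192.168.25.101
```

In this sketch the second SSIP is contacted only after the first one times out, which is the failover-only behavior the section recommends.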
Although the additional SSIPs are in place for failover, all configured SSIPs are active and respond to DNS server requests. The Multi-SSIP configuration is active/passive, where each node hosting an SSIP is independent and ready to respond to DNS server requests, irrespective of the previous SSIP failing. Therefore, SmartConnect continues to function correctly when the DNS server contacts one of the other SSIPs, providing SSIP fault tolerance. However, because each node hosts an SSIP independently of the other SSIP-hosting nodes, it is unaware of how far the other SSIPs have progressed through the load-balancing policy and restarts the policy from the first option. For example, if the SmartConnect load-balancing policy is round robin for a 50-node subnet, assume that the first SSIP has distributed IP addresses for the first ten nodes. If the second SSIP is contacted by the DNS server, it starts distributing node IP addresses at the first option again, in this case, node one, rather than node eleven. The node hosting the SSIP is unaware of the node IP address distributed by the previous SSIP.
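The round robin example above can be modeled with a short Python sketch. The `SsipResponder` class is a hypothetical model used only to show that each SSIP keeps its own position in the policy; it does not reflect the OneFS implementation.

```python
from itertools import cycle

NODES = [f"node{n}" for n in range(1, 51)]  # the 50-node subnet from the example


class SsipResponder:
    """Hypothetical model of one SSIP host with its own round-robin position."""

    def __init__(self, nodes):
        self._next_node = cycle(nodes)  # state is private to this SSIP

    def answer(self):
        return next(self._next_node)


first_ssip = SsipResponder(NODES)
second_ssip = SsipResponder(NODES)

handed_out = [first_ssip.answer() for _ in range(10)]  # node1 through node10
after_failover = second_ssip.answer()  # restarts at node1, not node11
print(handed_out[-1], after_failover)  # node10 node1
```

Because the two responders share no state, alternating between them would repeatedly hand out the same low-numbered nodes, which is why the SSIPs should be used for failover only.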
Note: As a best practice, do not configure the site DNS server to load balance the SSIPs. Each additional SSIP is only a failover mechanism, providing fault tolerance and SSIP failover. Allow OneFS to perform load balancing through the selected SmartConnect policy, ensuring effective load balancing.
Multi-SSIP is configured from the user interface or the CLI, by specifying a range of IP addresses. The range of IP addresses is applied to between two and six SSIPs per subnet, depending on the SmartConnect license.
To configure Multi-SSIP from the user interface, click Cluster Management > Network Configuration. Next, either select an existing subnet and click Edit, or if under a groupnet, click More > Add subnet and scroll to the SmartConnect service IPs section, as displayed in the following figure:
To configure Multi-SSIP from the CLI, use the --sc-service-addrs option with an IP address range, as displayed in the following command:
isi network subnets modify subnet0 --sc-service-addrs=192.168.25.10-192.168.25.11
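Before applying a range, it can be useful to confirm that it does not exceed the license limit. The following Python sketch is an illustrative pre-check only: `expand_range`, `check_ssip_range`, and the tier names `basic` and `advanced` are hypothetical and are not part of OneFS. The limits come from the license description earlier in this section.

```python
import ipaddress
from typing import List

# Limits stated in the text: SSIPs allowed per subnet for each license.
MAX_SSIPS = {"basic": 2, "advanced": 6}


def expand_range(addr_range: str) -> List[str]:
    """Expand 'low-high' (or a single address) into individual IPv4 addresses."""
    low_text, _, high_text = addr_range.partition("-")
    low = ipaddress.IPv4Address(low_text)
    high = ipaddress.IPv4Address(high_text or low_text)
    if high < low:
        raise ValueError(f"range end {high} is below range start {low}")
    return [str(ipaddress.IPv4Address(n)) for n in range(int(low), int(high) + 1)]


def check_ssip_range(addr_range: str, license_tier: str) -> List[str]:
    """Return the SSIPs in the range, or raise if the license limit is exceeded."""
    ssips = expand_range(addr_range)
    limit = MAX_SSIPS[license_tier]
    if len(ssips) > limit:
        raise ValueError(
            f"{len(ssips)} SSIPs requested, but {license_tier} allows {limit}"
        )
    return ssips


print(check_ssip_range("192.168.25.10-192.168.25.11", "basic"))
# ['192.168.25.10', '192.168.25.11']
```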
Also, the IP address range may be cleared, or additional ranges may be added, using the following commands:
Multi-SSIP is a feature for SSIP failover, providing SSIP fault tolerance. Configure DNS servers for SSIP failover, ensuring the next SSIP is only contacted if the first SSIP connection times out. If the SSIPs are not configured in a failover sequence, the SSIP load-balancing policy resets each time a new SSIP is contacted. The SSIPs do not track the current distribution status of the other SSIPs because they function independently, negating the function of the selected load-balancing policy.
Configuring IP addresses as failover-only addresses is not supported on all DNS servers. To support Multi-SSIP as a failover only option, a DNS server with support for failover addresses is recommended. If a DNS server does not support failover addresses, Multi-SSIP still provides advantages over a single SSIP. However, increasing the number of SSIPs might affect the ability of SmartConnect to load balance.
Note: If the DNS server does not support failover addresses, test Multi-SSIP in a lab environment mimicking the production environment to confirm the impact on SmartConnect load balancing for a specific workflow. Only after confirming workflow impacts in a lab environment should a production cluster be updated.
If the site DNS server supports failover IP addresses, proceed with the configuration in this section.
As explained earlier in DNS delegation best practices, the first SSIP should be created in DNS as an address (A) record, also referred to as a host entry. The additional SSIPs should be configured as DNS A record failover IP addresses. The first IP address should point to the first SSIP, followed by the IP address of each additional configured SSIP for failover. The additional SSIPs provide redundancy in an active/passive pattern.
If the site DNS server does not support failover IP addresses, proceed with the configuration in this section.
Note: Before configuring a DNS server that does not support failover IP addresses, consider that the load-balancing status in SmartConnect is managed independently by each SSIP, as explained in SmartConnect Multi-SSIP. The total impact on load-balancing behavior depends on whether the site DNS server has recursion enabled, how many SSIPs are configured, the load-balancing policy, and the workflow. To confirm the impacts for a specific environment, test in a lab environment mimicking the production environment before updating a production cluster.
To configure a DNS server that does not support failover IP addresses for Multi-SSIP, create an NS record and a matching A/AAAA record for each SSIP. Most DNS servers use Round Trip Time (RTT) to decide which nameserver to use. As an example, for OneFS and a BIND DNS server, consider the following configuration:
isi network subnets modify groupnet0.subnet0 --sc-service-name=cluster-ns1.company.com --sc-service-addrs=220.127.116.11-18.104.22.168
isi network pools modify groupnet0.subnet0.pool0 --sc-connect-policy round_robin
cluster-ns1 IN A 22.214.171.124
cluster-ns2 IN A 126.96.36.199
cluster-ns3 IN A 188.8.131.52
@ IN NS cluster-ns1.company.com.
@ IN NS cluster-ns2.company.com.
@ IN NS cluster-ns3.company.com.
This configuration may be adapted to Windows DNS servers or other DNS servers. A known issue with Windows DNS servers is the forced 1-second TTL, which also affects single-SSIP configurations, as noted in Other SmartConnect considerations.
Also, in an environment where the site DNS server does not support failover IP addresses, consider the following information and recommendations:
Within a subnet, up to six SSIPs are available, depending on the SmartConnect license. Prior to OneFS 8.2, the single SSIP was assigned to the lowest Node ID in the specified subnet. Hosting the SSIP on the lowest Node ID created issues because often the lowest Node ID is providing other services and could be the first to reboot in a rolling upgrade.
Multi-SSIP introduces an enhancement to assigning SSIPs. Attaching an SSIP to a node is no longer dependent on the Node ID. OneFS 8.2 creates a file containing SSIP information, the SSIP Resource File. To host an SSIP, a node must hold a lock on this file. All the nodes that are ready to host an SSIP attempt to lock the SSIP Resource File, and the first nodes to get the lock host the SSIPs. The new process bases node assignment on which nodes within the subnet obtain the lock, avoiding the issues from previous releases. Once the node goes offline, or the interface goes down, the SSIP becomes available for locking again, and the next node to capture the lock hosts the SSIP, as illustrated in the following figure. OneFS distributes SSIPs as evenly as possible within a subnet, using a feature that limits a single node from hosting multiple SSIPs. In certain scenarios, a node may host more than a single SSIP, depending on the number of nodes and SSIPs in the subnet.
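The lock-based assignment can be pictured with a simplified Python model. This is only a sketch under stated assumptions: `assign_ssips` is a hypothetical function, the order of `ready_nodes` stands in for the order in which nodes request the lock, and the "fewest SSIPs held" rule is an assumed approximation of the even-distribution behavior, not the actual OneFS algorithm.

```python
from collections import Counter
from typing import Dict, List


def assign_ssips(ssips: List[str], ready_nodes: List[str]) -> Dict[str, str]:
    """Map each SSIP to a node, serving nodes in the order they request the lock."""
    if not ready_nodes:
        return {}
    held = Counter()  # number of SSIPs each node currently hosts
    owners = {}
    for ssip in ssips:
        # Prefer the node hosting the fewest SSIPs; ties go to the earliest arrival.
        winner = min(ready_nodes, key=lambda node: held[node])
        owners[ssip] = winner
        held[winner] += 1
    return owners


# Node ID plays no role: node3 asked first, so it hosts the first SSIP.
print(assign_ssips(["ssip-a", "ssip-b"], ["node3", "node1", "node2"]))
# {'ssip-a': 'node3', 'ssip-b': 'node1'}

# With more SSIPs than nodes, one node has to host two.
print(assign_ssips(["ssip-a", "ssip-b", "ssip-c"], ["node3", "node1"]))
# {'ssip-a': 'node3', 'ssip-b': 'node1', 'ssip-c': 'node3'}
```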
OneFS 8.2 also introduces a new method for handling configuration and group changes. In releases prior to OneFS 8.2, SmartConnect unconditionally stopped and unconfigured the SSIP during a configuration or group change, and then evaluated where it should run, which was frequently the same node. In OneFS 8.2, the SSIP remains in place through configuration and group changes. After the changes, the SSIP moves only if necessary, minimizing failover impacts.
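The difference between the two behaviors can be illustrated with a small Python sketch. It is a hypothetical model, not OneFS code: `reassign_after_change` keeps every SSIP whose host is still ready and relocates only the orphaned ones, whereas the earlier behavior would correspond to discarding all assignments and starting over.

```python
from collections import Counter
from typing import Dict, List


def reassign_after_change(
    owners: Dict[str, str],
    ssips: List[str],
    ready_nodes: List[str],
) -> Dict[str, str]:
    """Keep SSIPs on nodes that are still ready; move only the orphaned SSIPs."""
    if not ready_nodes:
        return {}
    result = {
        ssip: node
        for ssip, node in owners.items()
        if ssip in ssips and node in ready_nodes
    }
    held = Counter(result.values())
    for ssip in ssips:
        if ssip in result:
            continue  # host survived the change, so the SSIP stays in place
        # Orphaned SSIP: give it to the ready node hosting the fewest SSIPs.
        winner = min(ready_nodes, key=lambda node: held[node])
        result[ssip] = winner
        held[winner] += 1
    return result


before = {"ssip-a": "node1", "ssip-b": "node2"}
# node2 leaves the group and node3 joins; only ssip-b has to move.
after = reassign_after_change(before, ["ssip-a", "ssip-b"], ["node1", "node3"])
print(after)  # {'ssip-a': 'node1', 'ssip-b': 'node3'}
```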
To confirm which of the nodes are hosting SSIPs, use the following commands:
isi_for_array ifconfig | grep <SSIP>
isi_for_array ifconfig | grep "zone 0"