This section applies to multihost SAP HANA scale-out instances that use host autofailover. On failover, the database on the standby host must have read and write access to the files of the failed active host. To avoid corrupting these files, the failed host must no longer be able to write to them. This concept is known as fencing.
When using shared file systems such as Isilon NAS storage with NFSv3, you can implement the STONITH ("shoot the other node in the head") method to achieve proper fencing and ensure that file locks are always freed.
In such a setup, use the Storage Connector API to invoke the STONITH calls. During failover, the SAP HANA active host calls the STONITH method of the custom storage connector with the hostname of the failed host as an input value.
A mapping of hostnames to management network addresses is maintained and used to send a reboot signal to the failed server over the management network. When the host comes up again, it automatically starts in the standby host role.
The STONITH example in this guide uses the Intelligent Platform Management Interface (IPMI) protocol in bare-metal deployments with Dell PowerEdge servers.
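The hostname-to-management-address step can be sketched in a few lines of Python. This is a minimal illustration, not the connector's actual implementation: the function names `ipmi_hostname` and `build_power_command` are hypothetical, and the `-ipmi` suffix is assumed to match the naming convention used for the `/etc/hosts` entries in this guide.

```python
def ipmi_hostname(hostname, suffix="-ipmi"):
    """Return the management-network name for a HANA host.

    Assumes the management interface is resolvable as <hostname><suffix>.
    """
    return "%s%s" % (hostname, suffix)


def build_power_command(hostname, action, user, password):
    """Build an ipmitool argument list for a chassis power action."""
    allowed = ("status", "on", "off", "cycle", "reset")
    if action not in allowed:
        raise ValueError("unsupported power action: %s" % action)
    return [
        "ipmitool", "-I", "lanplus",
        "-H", ipmi_hostname(hostname),
        "-U", user, "-P", password,
        "power", action,
    ]


if __name__ == "__main__":
    # Placeholder credentials; replace with values for your environment.
    print(" ".join(build_power_command("r640-isi01", "status", "root", "xxxx")))
```

Building the command as an argument list rather than a single string avoids shell quoting problems if a password contains special characters.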
Enable IPMI over LAN
On Dell EMC PowerEdge servers, you must configure IPMI over LAN in iDRAC before external systems can send IPMI commands over LAN channels. Until IPMI over LAN is configured, external systems cannot communicate with iDRAC using IPMI commands.
To configure IPMI over LAN:
The iDRAC Settings Network page appears.
Using a standard naming convention, the /etc/hosts file maintains a mapping of hostnames to IPMI IP addresses to be used in STONITH, as shown in the following figure:
ipmitool power status -H r640-isi01-ipmi -U root -P xxxx
If IPMI is working correctly, the command returns "Chassis Power is on".
chmod u+s /usr/bin/ipmitool
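The manual power status check can also be scripted, for example to test every host before enabling the hook. The following is a sketch under stated assumptions: it requires Python 3.7 or later and `ipmitool` on the PATH, the function names `parse_power_state` and `query_power_state` are hypothetical, and the `-I lanplus` interface is taken from the hook sample in this guide.

```python
import subprocess


def parse_power_state(output):
    """Map 'ipmitool power status' output to 'on', 'off', or None."""
    text = output.lower()
    if "power is on" in text:
        return "on"
    if "power is off" in text:
        return "off"
    # Anything else (errors, timeouts, empty output) is treated as unknown.
    return None


def query_power_state(ipmi_host, user, password, timeout=30):
    """Run ipmitool against one management address and parse the result."""
    argv = [
        "ipmitool", "-I", "lanplus",
        "-H", ipmi_host, "-U", user, "-P", password,
        "power", "status",
    ]
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return parse_power_state(result.stdout)
```

Treating unrecognized output as unknown rather than as "off" matters for fencing: a host should only be considered fenced when its power state has been positively confirmed.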
Create a custom HA/DR STONITH provider
To create your own HA/DR provider, complete the following steps and then add the hook method that you want to use. Our example uses STONITH. For information about the HA/DR provider, see the SAP HANA Administration Guide.
The directory should be within the /hana/shared storage of the SAP HANA installation but outside the <SID> directory structure. Our example uses the /hana/shared/HANA_Hooks location.
For example, you can copy the file and rename it to: /hana/shared/HANA_Hooks/HA_STONITH_Hook.py
Within the HA_STONITH_Hook.py file, we also customized def __init__(), def about and the STONITH hook def stonith, as shown in the following code sample:
"""
Sample for a HA/DR hook provider.

When using your own code in here, please copy this file to a location on /hana/shared outside the HANA installation.
This file will be overwritten with each hdbupd call! To configure your own changed version of this file, please add to your global.ini lines similar to this:

    [ha_dr_provider_HA_STONITH_Hook]
    provider = HA_STONITH_Hook
    path = /hana/shared/HANA_Hooks
    execution_order = 50

For all hooks, 0 must be returned in case of success.
"""
from hdb_ha_dr.client import HADRBase, Helper
import os
import time


class HA_STONITH_Hook(HADRBase):

    def __init__(self, *args, **kwargs):
        # delegate construction to base class
        super(HA_STONITH_Hook, self).__init__(*args, **kwargs)

    def about(self):
        return {"provider_company": "DellEMC",
                "provider_name": "HA_STONITH_Hook",  # provider name = class name
                "provider_description": "Dell EMC IPMI stonith for HANA",
                "provider_version": "1.0"}

    def startup(self, hostname, storage_partition, system_replication_mode, **kwargs):
        self.tracer.debug("enter startup hook; %s" % locals())
        self.tracer.debug(self.config.toString())
        self.tracer.info("leave startup hook")
        return 0

    def shutdown(self, hostname, storage_partition, system_replication_mode, **kwargs):
        self.tracer.debug("enter shutdown hook; %s" % locals())
        self.tracer.debug(self.config.toString())
        self.tracer.info("leave shutdown hook")
        return 0

    def failover(self, hostname, storage_partition, system_replication_mode, **kwargs):
        self.tracer.debug("enter failover hook; %s" % locals())
        self.tracer.debug(self.config.toString())
        self.tracer.info("leave failover hook")
        return 0

    def stonith(self, failingHost, **kwargs):
        self.tracer.debug("enter HANA HA stonith hook; %s" % locals())
        self.tracer.debug(self.config.toString())
        self.tracer.info("Stonith - rebooting failing host %s" % failingHost)
        ipmi_host = "%s-ipmi" % failingHost
        # default credentials in command example, update credentials for your environment
        power_off = "ipmitool power off -I lanplus -H %s -U root -P calvin" % ipmi_host
        power_status = "ipmitool power status -I lanplus -H %s -U root -P calvin" % ipmi_host
        power_on = "ipmitool power on -I lanplus -H %s -U root -P calvin" % ipmi_host
        # Stay non-zero until power-off is positively confirmed, so that an
        # ipmitool error or unexpected output is reported as a failure
        rc = 1

        # Power off failing host and check power status
        (code, output) = Helper._runOsCommand(power_off)
        self.tracer.info(output)
        time.sleep(10)
        (code, output) = Helper._runOsCommand(power_status)
        self.tracer.info(output)
        if 'Power is off' in output:
            msg = "Successfully powered off %s" % failingHost
            self.tracer.info(msg)
            rc = 0
        elif 'Power is on' in output:
            msg = "failed to power off %s, will try again" % failingHost
            self.tracer.info(msg)
            (code, output) = Helper._runOsCommand(power_off)
            self.tracer.info(output)
            time.sleep(10)
            (code, output) = Helper._runOsCommand(power_status)
            self.tracer.info(output)
            if 'Power is off' in output:
                msg = "Successfully powered off %s" % failingHost
                self.tracer.info(msg)
                rc = 0
            elif 'Power is on' in output:
                msg = "unable to power off %s - Please CHECK" % failingHost
                self.tracer.info(msg)
                return 1

        # Power back on the failed host
        if rc == 0:
            (code, output) = Helper._runOsCommand(power_on)
            self.tracer.info(output)
            time.sleep(10)
            (code, output) = Helper._runOsCommand(power_status)
            self.tracer.info(output)
            if 'Power is on' in output:
                msg = "successfully powered on %s" % failingHost
                self.tracer.info(msg)
                rc = 0
            elif 'Power is off' in output:
                msg = "unable to power on %s - will try again to power on" % failingHost
                self.tracer.info(msg)
                (code, output) = Helper._runOsCommand(power_on)
                self.tracer.info(output)
                time.sleep(10)
                (code, output) = Helper._runOsCommand(power_status)
                self.tracer.info(output)
                if 'Power is off' in output:
                    msg = "unable to power on %s - Please CHECK" % failingHost
                    self.tracer.info(msg)
                    rc = 1
                elif 'Power is on' in output:
                    msg = "Successfully powered on %s - Please CHECK" % failingHost
                    self.tracer.info(msg)
                    rc = 0
        self.tracer.info("leaving HANA HA stonith hook")
        return rc

    def preTakeover(self, isForce, **kwargs):
        """Pre takeover hook."""
        self.tracer.info("%s.preTakeover method called with isForce=%s" % (self.__class__.__name__, isForce))
        if not isForce:
            # run pre takeover code
            # run pre-check, return != 0 in case of error => will abort takeover
            return 0
        else:
            # possible force-takeover only code
            # usually nothing to do here
            return 0

    def postTakeover(self, rc, **kwargs):
        """Post takeover hook."""
        self.tracer.info("%s.postTakeover method called with rc=%s" % (self.__class__.__name__, rc))
        if rc == 0:
            # normal takeover succeeded
            return 0
        elif rc == 1:
            # waiting for force takeover
            return 0
        elif rc == 2:
            # error, something went wrong
            return 0

    def srConnectionChanged(self, parameters, **kwargs):
        self.tracer.debug("enter srConnectionChanged hook; %s" % locals())
        # Access to parameters dictionary
        hostname = parameters['hostname']
        port = parameters['port']
        database = parameters['database']
        status = parameters['status']
        databaseStatus = parameters['database_status']
        systemStatus = parameters['system_status']
        timestamp = parameters['timestamp']
        isInSync = parameters['is_in_sync']
        reason = parameters['reason']
        siteName = parameters['siteName']
        self.tracer.info("leave srConnectionChanged hook")
        return 0

    def srReadAccessInitialized(self, parameters, **kwargs):
        self.tracer.debug("enter srReadAccessInitialized hook; %s" % locals())
        # Access to parameters dictionary
        database = parameters['last_initialized_database']
        databasesNoReadAccess = parameters['databases_without_read_access_initialized']
        databasesReadAccess = parameters['databases_with_read_access_initialized']
        timestamp = parameters['timestamp']
        allDatabasesInitialized = parameters['all_databases_initialized']
        self.tracer.info("leave srReadAccessInitialized hook")
        return 0

    def srServiceStateChanged(self, parameters, **kwargs):
        self.tracer.debug("enter srServiceStateChanged hook; %s" % locals())
        # Access to parameters dictionary
        hostname = parameters['hostname']
        service = parameters['service_name']
        port = parameters['service_port']
        status = parameters['service_status']
        previousStatus = parameters['service_previous_status']
        timestamp = parameters['timestamp']
        daemonStatus = parameters['daemon_status']
        databaseId = parameters['database_id']
        databaseName = parameters['database_name']
        databaseStatus = parameters['database_status']
        self.tracer.info("leave srServiceStateChanged hook")
        return 0
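The stonith hook above repeats the same sequence several times: issue a power command, wait, then check the power status. As an alternative way to structure that logic (not the form used in the guide's sample), the sequence could be factored into one helper. In this sketch the name `set_power_with_retry` and its parameters are hypothetical, and `run` stands in for any callable that returns a `(code, output)` pair, which is how the sample uses `Helper._runOsCommand`.

```python
import time


def set_power_with_retry(run, action_cmd, status_cmd, expected,
                         attempts=2, wait_seconds=10, sleep=time.sleep):
    """Issue a power command, then verify the resulting power state.

    run          -- callable taking a command string, returning (code, output)
    action_cmd   -- command that changes the power state
    status_cmd   -- command that reports the power state
    expected     -- substring that must appear in the status output
    Returns 0 once the expected state is confirmed, otherwise 1.
    """
    for _ in range(attempts):
        run(action_cmd)
        sleep(wait_seconds)
        _code, output = run(status_cmd)
        if expected in output:
            return 0
    # Unconfirmed state, including error output, counts as failure.
    return 1
```

Inside the hook, a call such as `set_power_with_retry(Helper._runOsCommand, power_off, power_status, 'Power is off')` would replace the first nested block. Passing `sleep` as a parameter keeps the helper testable without real delays.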
Install the HA/DR provider script
Add, configure, and monitor your custom provider scripts in SAP HANA Studio. Install the HA/DR provider script on an SAP HANA system by adding a section called [ha_dr_provider_<classname>] to the global.ini file. Use the following parameters:
For example:
[ha_dr_provider_HA_STONITH_Hook]
provider = HA_STONITH_Hook
path = /hana/shared/HANA_Hooks
execution_order = 50
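To avoid typing errors when the same section must be prepared for several systems, the section text can be generated. This is a minimal sketch using Python's standard `configparser` module; the function name `render_provider_section` is hypothetical. It assumes that the angle brackets in the section template are placeholder markers and are not typed literally, so the class name appears without them.

```python
import configparser
import io


def render_provider_section(class_name, path, execution_order):
    """Return the global.ini text for one HA/DR provider section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    section = "ha_dr_provider_%s" % class_name
    parser[section] = {
        "provider": class_name,  # must match the Python class name
        "path": path,            # directory that contains <class_name>.py
        "execution_order": str(execution_order),
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_provider_section("HA_STONITH_Hook", "/hana/shared/HANA_Hooks", 50))
```

The sketch only produces text for review; it does not modify the global.ini file of a running system.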
Verify the installation of the HA/DR provider script HA_STONITH_Hook.py
All scripts are loaded during the startup phase of the name server. Monitor the name server trace files, which collect general information about the ha_dr_provider together with the trace output and return codes of the hooks.
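Checking a trace file for provider activity can be partly automated. The sketch below filters lines by substring only and makes no assumption about the exact trace line layout. The function name `find_hook_messages` and the default markers are hypothetical choices, based on the provider section name and the class name used in this guide.

```python
def find_hook_messages(lines, markers=("ha_dr_provider", "HA_STONITH_Hook")):
    """Return the lines that mention any of the given marker strings.

    lines   -- any iterable of text lines, such as an open trace file
    markers -- substrings that identify provider-related entries
    """
    return [line.rstrip("\n") for line in lines
            if any(marker in line for marker in markers)]
```

To use it, open the name server trace file of the relevant host in text mode and pass the file object as `lines`. The location of the trace file depends on the installation and is not assumed here.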
Perform host autofailovers to ensure that the failovers work as expected and that STONITH has been run against the failed host. The following figure shows sample output from the name server trace file following a host autofailover and successful operation of the STONITH: