This chapter describes the components and architecture of the Dell SUSE Pacemaker cluster solution for SAP HANA.
SAP solutions consist of multiple components and services that interact with one another to form a comprehensive network of functions, including:
For normal operation, only the ASCS instance is needed. In a failover scenario, the newly started ASCS instance receives the current status from the ERS instance, and normal operation can continue within seconds. Without the ERS, a restart of the ASCS instance would result in the loss of all database table locks, and the same object would become editable by two people at the same time, potentially resulting in data loss. When one side fails, the service is started automatically on the other server and is switched back as soon as the failed node is back online.
Note:
In a data center failover scenario, the targeted workload of the servers must be below 50 percent capacity; otherwise, your servers will be overloaded when one side fails.
In a single data center scenario, the remaining application servers must be able to handle the load of the failed server. The target load for peak operation can be calculated using the formula 1 - 1/N, where N is the number of application servers in the data center. For example, with four application servers, the target peak load is 75 percent per server. System administrators must monitor this load during normal operations and add hardware when the load of the servers exceeds this target.
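The sizing rule in the note can be sketched as a small calculation. The following Python snippet is an illustration only, not part of the validated design: the function names `target_peak_load` and `load_after_failure` are hypothetical, and the sketch assumes that the load of a failed server is redistributed evenly across the surviving servers.

```python
def target_peak_load(n_servers: int) -> float:
    """Return the maximum per-server load (as a fraction of capacity)
    that still lets N-1 servers absorb the load of one failed server.

    Implements the formula 1 - 1/N from the note.
    """
    if n_servers < 2:
        raise ValueError("at least two application servers are required")
    return 1 - 1 / n_servers


def load_after_failure(current_load: float, n_servers: int) -> float:
    """Return the per-server load on the survivors after one server fails.

    Assumption: the total load is spread evenly over the remaining servers.
    """
    if n_servers < 2:
        raise ValueError("at least two application servers are required")
    return current_load * n_servers / (n_servers - 1)


if __name__ == "__main__":
    # Print the target for a few cluster sizes.
    for n in (2, 3, 4, 5):
        print(f"N={n}: target peak load {target_peak_load(n):.0%}")
```

With two servers the function returns 0.5, which matches the 50 percent limit stated for the data center failover scenario. A server that runs exactly at the target load would be fully utilized after a failure, so in practice some additional headroom below the target may be advisable.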
SAP landscapes rely on a database element such as SAP HANA, which must be considered for high availability (HA) protection. Your SAP environment may also include the NetWeaver platform on which S/4HANA runs.
To be effective, an HA solution must monitor the health of all these services. If a service does not perform as expected or fails completely, the HA solution takes the necessary action to recover that service as quickly as possible. The solution brings services online on the secondary nodes in the correct order to ensure that SAP functions resume quickly and data is not corrupted. Failover clustering protection ensures minimal downtime of services, and thus minimal impact on end users. In parallel, the database is protected by the replication of its data to a secondary node in the same HA cluster, eliminating the SAN as a single point of failure.
It can also be beneficial to geographically separate the nodes in the cluster for protection against site-wide or regional disasters. A clustering solution allows failover between cluster nodes on different cloud subnets or in geographically separated locations.
Dell recommends using a monitoring solution to ensure that the cluster is kept in a state in which it can fulfill its main task of protecting against failures. Consider using tools such as Prometheus and Grafana to view and report on the status of the cluster and its resources.
The solution caters for the following use cases:
The SAP landscape consists of three components:
Several availability concepts protect against disasters. Dell SAP engineering used a two-node SLES Pacemaker cluster with SAP HANA system replication instead of switching a single database instance over to the secondary host. The availability requirements of your system will determine the best-fit option for your environment.
By using the “syncmem” replication mode, the Dell team achieved failover times on the order of seconds. These times were much faster than a complete startup of a database, which would require transferring the data disks to the other host.
Note: A cluster such as the one described here is not a replacement for a classic backup. This configuration will not prevent data loss when data in the database is deleted by mistake.