This chapter describes the architecture of the Dell Red Hat Pacemaker cluster solution for SAP HANA.
SAP solutions consist of multiple components and services that interact to form a comprehensive network of functions, including:
For more information about these terms, see Table 1.
For normal operation, only the ASCS instance is needed. In a failover scenario, the VMware fault tolerance mechanism fails over the active VM state to a shadow VM and continues the operation. While the limit for VMware fault tolerance is four virtual CPU cores, this is enough for at least 4,000 concurrent users. This makes the HA solution easier to set up and maintain, compared to a full ASCS-ERS clustered setup.
In a data center failover scenario, the target workload of the servers must be below 50 percent of capacity; otherwise, the servers on the surviving side become overloaded when one side fails. In a single data center scenario, the remaining application servers must be able to handle the load of the failed server. The maximum target load per server during peak operation is 1 - 1/N, where N is the number of application servers in the data center. System administrators must monitor this load during normal operations and add hardware if server loads exceed the target.
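The sizing rule above can be sketched in a few lines of Python. This is only an illustration, assuming that load is spread evenly across identical application servers and that a single server fails at a time. The function names are hypothetical and are not part of any Dell, SAP, or VMware tooling.

```python
def target_peak_load(n_servers: int) -> float:
    """Maximum per-server load (as a fraction of capacity) that still
    lets the remaining servers absorb the loss of one server: 1 - 1/N."""
    if n_servers < 2:
        raise ValueError("At least two application servers are needed for failover")
    return 1 - 1 / n_servers


def load_after_single_failure(per_server_load: float, n_servers: int) -> float:
    """Per-server load after one of N evenly loaded servers fails and its
    work is redistributed across the remaining N - 1 servers."""
    if n_servers < 2:
        raise ValueError("At least two application servers are needed for failover")
    return per_server_load * n_servers / (n_servers - 1)


if __name__ == "__main__":
    for n in (2, 3, 4, 8):
        limit = target_peak_load(n)
        after = load_after_single_failure(limit, n)
        print(f"N={n}: target load <= {limit:.0%}, load after one failure = {after:.0%}")
```

With N = 2 the rule yields 50 percent, which matches the limit for two equally sized sides in the data center failover scenario. Running servers exactly at the target leaves no headroom after a failure, so a lower operating target might be preferable in practice.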
It can also be beneficial to separate the cluster nodes geographically for protection against site-wide or regional disasters. A clustering solution allows failover between cluster nodes on different cloud subnets or in geographically separated locations.
VMware vSphere monitors the health of the virtualization hosts and restarts the VMs in the event of a host failure. If the resulting downtime for the SAP HANA database is longer than acceptable, a separate cluster stack for the database is needed to provide faster failover times. A cold start of a database can take up to 20 minutes, which some customers might consider unacceptable.
To be effective, an HA solution must monitor the health of all these services. If a service fails or does not perform as expected, the HA solution takes the necessary action to recover that service as quickly as possible. The solution brings services online on the secondary nodes in the correct order to ensure that SAP functions resume quickly and that data is not corrupted. Failover clustering protection ensures minimal service downtime and minimal impact on end users. In parallel, the database is protected by the replication of its data to a secondary node in the same HA cluster, eliminating the SAN as a single point of failure.
Dell Technologies recommends using a monitoring solution to ensure that the cluster is kept in a state in which it can fulfill its main function of protecting against failures.
Tools such as Prometheus and Grafana enable you to view and report on the status of the cluster and its resources.
The SAP landscape consists of three components:
Several availability concepts protect against disasters. Dell SAP engineering used a two-node Red Hat Enterprise Linux Pacemaker cluster with SAP HANA system replication (HSR) instead of switching a single database instance over to the secondary host. The availability needs of your system determine the best option for your environment.
By using the “syncmem” replication mode, the Dell team achieved failover times on the order of seconds. These times were faster than a complete database startup that would require transferring the data disks to the other host.
Note: The described cluster is not a replacement for a classic backup. This configuration does not prevent data loss when data in the database is deleted by mistake.
In network architecture, redundancy refers to the installation of additional devices to ensure availability in the case of device failure. Dell Technologies recommends using at least two separate networks in an HA configuration. The first network is for communication between the servers, and the second network is for SAP HANA system replication (HSR) and serves as a secondary cluster communication path.