PowerFlex Native Asynchronous Replication RPO with Oracle
Mon, 17 Aug 2020 15:52:45 -0000|
Read Time: 0 minutes
PowerFlex software-defined storage platform provides a reliable, high-performance foundation for mission-critical applications like Oracle databases. In many of these deployments, replication and disaster recovery have become a common practice for protecting critical data and ensuring application uptime. In this blog, I will be discussing strategies for replicating mission-critical Oracle databases using Dell EMC PowerFlex software-defined storage.
The Role of Replication and Disaster Recovery in Enterprise Applications
Customers require Disaster Recovery and Replication capabilities to meet mission-critical business requirements where SLAs require the highest uptime. Customers also want the ability to quickly recover from physical or logical disasters to ensure business continuity in the event of disaster and be able to bring up the applications in minimal time without impact to data. Replication means that the same data is available at multiple locations. For Oracle database environments, it is important to have local and remote replicas of application data which are suitable for testing, development, reporting, and disaster recovery and many other operations. Replication improves the performance and protects the availability of Oracle database application because the data exists in another location. Advantages of having multiple copies of data being present across geographies is that, critical business applications will continue to function if the local Oracle database server experiences a failure.
Replication enables customers in various scenarios such as:
- Disaster Recovery for applications ensuring business continuity
- Distributing with one type of use case such as analytics
- Offloading for mission-critical workloads such as BI, Analytics, Data Warehousing, ERP, MRP, and so on
- Data Migration
- Disaster Recovery testing
PowerFlex Software-Defined Storage – Flexibility Unleashed
PowerFlex is a software-defined storage platform designed to significantly reduce operational and infrastructure complexity, empowering organizations to move faster by delivering flexibility, elasticity, and simplicity with predictable performance and resiliency at scale. The PowerFlex family provides a foundation that combines compute as well as high performance storage resources in a managed unified fabric.
PowerFlex is designed to provide extreme performance and massive scalability up to 1000s of nodes. It can be deployed as a disaggregated storage / compute (two-layer), HCI (single-layer), or a mixed architecture. PowerFlex inclusively supports applications ranging from bare-metal workloads and virtualized machines to cloud-native containerized apps. It is widely used for large-scale mission-critical applications like Oracle database. For information about best practices for deploying Oracle RAC on PowerFlex, see Oracle RAC on PowerFlex rack.
PowerFlex also offers several enterprise-class native capabilities to protect critical data at various levels:
- Storage Disk layer: PowerFlex storage distributed data layout scheme is designed to maximize protection and optimize performance. A single volume is divided into chunks. These chunks will be striped on physical disks throughout the cluster, in a balanced and random manner. Each chunk has a total of two copies for redundancy.
- Fault Sets: By implementing Fault sets, we can ensure the persistent data availability at all time. PowerFlex (previously VxFlex OS) will mirror data for a Fault Set on SDSs that are outside the Fault Set. Thus, availability is assured even if all the servers within one Fault Set fail simultaneously. Fault Sets are subgroup of SDSs installed on host servers within a Protection Domain.
PowerFlex replication overview
PowerFlex software consists of a few important components - Meta Data Manager (MDM), Storage Data Server (SDS), Storage Data Client (SDC) and Storage Data Replicator (SDR). MDM manages the PowerFlex system as a whole, which includes metadata, devices mapping, volumes, snapshots, system capacity, errors and failures, system rebuild and rebalance tasks. SDS is the software component that enables a node to contribute its local storage to the aggregated PowerFlex pool. SDC is a lightweight device driver that exposes PowerFlex volumes as block devices to the applications and hosts. SDR handles the replication activities. PowerFlex has a unique feature called Protection Domain. A Protection Domain is a logical entity that contains a group of SDSs. Each SDS belongs to only one Protection Domain.
Figure 1. PowerFlex asynchronous replication between two systems
Replication occurs between two PowerFlex systems designated as peer systems. These peer systems are connected using LAN or WAN and are physically separated for protection purposes. Replication is defined in scope of a protection domain. All objects which participate in replication are contained in the protection domain, including volumes in Replication Consistency Group (RCG). Journal capacity from storage pools in the protection domain is shared among RCGs in the protection domain.
The SDR handles replication activities and manages I/O of replicated logical volumes. The SDR is deployed on the same server as SDS. Only I/Os from replicated volumes flows through SDR.
Replication Data Flow
Figure 2. PowerFlex replication I/O flow between two systems
- At the source, application I/O are passed from SDS to SDR.
- Application I/O are stored in the source journal space before it is sent to target. SDR packages I/O in bundles and sends them to the target journal space.
- Once the I/O are sent to target journal and get placed in target journal space, they are cleared from source.
- Once I/O are applied to target volumes, they are cleared from destination journal.
- For replicated volumes, SDS communicates to other SDS via SDR. For non-replicated volumes, SDS communicates directly with other SDS.
For detailed information about Architecture Overview, see Dell EMC PowerFlex: Introduction to Replication White Paper.
It is important to note that this approach to replication allows PowerFlex to support replication at extreme scales. As the number of nodes contributing storage are scaled, so are the SDR instances. As a result, this replication mechanism can scale effortlessly from 4 to 1000s of nodes while delivering RPOs as low as 30 seconds and meeting IO and throughput requirements.
Oracle Databases on PowerFlex
The following illustration demonstrates that the volumes participating in replication are grouped to form the Replication Consistency Group (RCG). RCG acts as the logical container for the volumes.
Figure 3. PowerFlex replication with Oracle database
Depending on the scenario, we can create multiple RCGs for each volume pair or combine multiple volume pairs in a single RCG.
In the above Oracle setup, PowerFlex System-1 is the source and PowerFlex System-2 is the destination. For replication to occur between the source and target, the following criteria must be met:
- A volume pair must be created in both source and target.
- Size of volumes in both source and target should be same. However, the volumes can be in different storage pools.
- Volumes are in read-write access mode on the source and read-only access mode in secondary. This is done to maintain data integrity and consistency between two peer systems.
The PowerFlex replication is designed to recover from as low as a 30 seconds RPOs minimizing the data-loss if there is a disaster recovery. During creation of RCG, users can specify RPO starting from 30 seconds to maximum of 60 minutes.
All the operations performed on source will be replicated to destination within the RPO. To ensure RPO compliance, PowerFlex replicates at least twice for every RPO period. For example, setting RPO to 30 seconds means that PowerFlex can immediately return to operation at the target system with only 30 seconds of potential data loss.
The following figures depicts the replication scenario under steady state of workload:
Figure 4. 100% RPO compliance for RPO of 30s for an Oracle database during a steady application workload
Figure 5. Replication dashboard view of PowerFlex
In the case of disaster recovery, the entire application can be up and running by failover to secondary, with less than 30 seconds of data loss.
When we do a planned switchover or failover, the volumes on secondary system are automatically changed to read-write access mode and the volumes on source will be changed to read-only. Consequently, we can bring up Oracle database on secondary by setting up the Oracle environment variables and starting the database.
Once we have RCG in the failover or switchover mode, user can decide how to continue with replication:
- Restore replication: Maintains the replication direction from original source to destination.
- Reverse replication: Changes the direction so that original destination becomes the source and replication will begin from original destination to original source.
PowerFlex also provides various other options:
- Pause and Resume RCG: If there are network issues or user need to perform maintenance of any of the hardware. While paused, any application I/O will be stored at source journal and is replicated to the destination only after the replication is resumed.
- Freeze and Unfreeze RCG: If the user requires consistent snapshot of the source or target volumes. While frozen, replication will still occur between source journal and destination journal, nonetheless the target journal holds on to the data and do not apply them to the target volumes.
PowerFlex native volume replication is a unique solution and provides customers with easy to configure and setup without worrying about disaster.
Irrespective of workload and application, it is designed to support massive scale while providing RPOs as low as 30 seconds.
For more information, please visit: DellTechnologies.com/PowerFlex.
Related Blog Posts
Deploying Microsoft SQL Server Big Data Clusters on OpenShift platform using PowerFlex
Mon, 16 Aug 2021 09:11:34 -0000|
Read Time: 0 minutes
Microsoft SQL Server 2019 introduced a groundbreaking data platform with SQL Server 2019 Big Data Clusters (BDC). SQL Server BDC are designed to solve the big data challenge faced by most organizations today. You can use SQL Server BDC to organize and analyze large volumes of data, you can also combine high value relational data with big data. In this blog, I will describe the deployment of Microsoft SQL Server BDC on an OpenShift container platform using Dell EMC PowerFlex software-defined storage.
PowerFlex (previously VxFlex OS) is the software foundation of PowerFlex software-defined storage. It is a unified compute storage and networking solution delivering scale-out block storage service designed to deliver flexibility, elasticity, and simplicity with predictable high performance and resiliency at scale.
The PowerFlex platform is available in multiple consumption options to help customers meet their project and data center requirements. PowerFlex appliance and PowerFlex rack provide customers comprehensive IT Operations Management (ITOM) and life cycle management (LCM) of the entire infrastructure stack in addition to sophisticated high-performance, scalable, resilient storage services. PowerFlex appliance and PowerFlex rack are the two preferred and proactively marketed consumption options. PowerFlex is also available on VxFlex Ready Nodes for those customers interested in software-defined compliant hardware without the ITOM and LCM capabilities.
PowerFlex software-define storage with unified compute and networking offers flexibility of deployment architecture to help best meet the specific deployment and architectural requirements. PowerFlex can be deployed in a two-layer for asymmetrical scaling of compute and storage for “right-sizing capacities, single-layer (HCI), or in mixed architecture.
OpenShift Container Platform
Red Hat® OpenShift® Container Platform is a platform to deploy and create containerized applications. OpenShift Container Platform provides administrators and developers with the tools they require to deploy and manage applications and services at scale. OpenShift Container Platform offers enterprises full control over their Kubernetes environments, whether they are on-premises or in the public cloud, giving you the freedom to build and run applications anywhere.
Microsoft SQL Server Big Data Clusters Overview
Microsoft SQL Server Big Data Clusters are designed to address big data challenges in a unique way, BDC solves many traditional challenges faced in building big-data and data-lake environments. You can query external data sources, store big data in HDFS managed by SQL Server, or query data from multiple external data sources using the cluster. See an overview of Microsoft SQL Server 2019 Big Data Clusters on the Microsoft page Microsoft SQL Server BDC details and on the GitHub page SQL Server BDC Workshops.
Deploying OpenShift Container Platform on PowerFlex
The OpenShift cluster is configured with three master nodes and eight worker nodes. To install OpenShift Container Platform on PowerFlex, see OpenShift Installation.
The following figure shows the logical architecture of Red Hat OpenShift 4.6.x deployed on PowerFlex. The CSAH node is configured with the required services like DNS, DHCP, HTTP Server, and HA Proxy. It also hosts PowerFlex Gateway and PowerFlex GUI.
The following example shows OpenShift cluster with three master and eight worker nodes.
Once OpenShift installation is complete, CSI 1.4 is deployed on the OCP cluster. The CSI driver controller pod is deployed on one of the worker nodes and there are eight vxflexos-node pods that are deployed across eight worker nodes.
For more information about installation of CSI on OpenShift, see the GitHub page CSI installation.
Deploying Microsoft SQL Server BDC on OpenShift Container Platform
Microsoft SQL Server BDC cluster is deployed using OpenShift Container Platform as shown in the architecture diagram below by following instructions available at installation.
The following steps are performed to deploy Microsoft SQL Server BDC cluster using OpenShift Container Platform:
- The Azure Data CLI is installed on the client machine.
- All the pre-requisites for Microsoft SQL Server BDC on OpenShift cluster are performed. For this solution, openshift-prod was selected as the source for the configuration template from the list of available templates.
- All the OpenShift worker nodes are labeled before the Microsoft SQL Server BDC is installed.
- The control.json and bdc.json files are generated.
- The bdc.json is modified from the default settings to use cluster resources and to address the workload requirements. For example, the bdc.json looks like:
- The SQL image deployed in the control.json is 2019-CU9-ubuntu-16.04. To scale out the BDC resource pools, the number of replicas is adjusted to fully leverage the resources of the cluster. The following figure shows the logical architecture of Microsoft SQL Server BDC on OpenShift Container Platform with PowerFlex:7. SQL Server HA deployment is configured along with two data and two compute pods. Three storage pods are also configured. This type of configuration is used for TPC-C, and TPC-H like deployment as SQL is at HA mode with a single primary and couple of replicas. The following figure shows the pod placements across the eight worker nodes.
- To achieve the performance tuning of Microsoft SQL Server BDC cluster, see Microsoft performance guidelines.
- Tune the Microsoft SQL Server master instance based on the recommended guidelines.
- A testing tool like HammerDB documentation is run to validate the Microsoft SQL Server BDC for TPROC-H queries. HammerDB queries are run against the SQL Master instance.
- Follow the HammerDB best practices for SQL server guidelines to get the optimum performance. Although the results met the performance capabilities of the test system, the purpose of the testing was to validate Microsoft SQL Server BDC cluster and ensure that all best practices are implemented.
The validation was performed with a minimum lab hardware. For 1.2 TB of data loaded into Microsoft SQL Server, the QpH@Size was achieved at 220,800 for five virtual users as shown in the figure below. The overall test was completed for all the users in less than 30 minutes. It was observed that the PowerFlex system was not highly utilized while the test was carried out, including the PowerFlex storage, CPU, and memory, allowing the system to accommodate additional potential workload.
The above test results show that SQL Server BDC deployed in a PowerFlex environment can provide a strong analytics platform for Data Warehousing type operations in addition to Big Data solutions.
To understand SQL Server BDC on upstream Kubernetes platform, see the paper SQL Server 2019 BDC on K8s.
Reducing Oracle TCO: Maximum Efficiency Architecture
Tue, 06 Jul 2021 12:37:56 -0000|
Read Time: 0 minutes
Last year, Dell EMC sponsored the 2020 IOUG Database Priorities Survey. One of the questions was, “What leading factors do you weigh when selecting infrastructure for your Oracle environment?”
The number 1 factor respondents mentioned, was “Cost”. This confirms my own experiences when talking to our customers. High cost is often the main decision factor, followed closely by performance (#2) and a number of other factors, most of which I tend to categorize under the umbrella term “IT Operations”. As you may know from reading some of my other blogposts, I am passionate about achieving maximum efficiency for business applications – which is also the reason for choosing the name of this blog (Dirty Cache) since 2011.
For the rest of this blog post, see https://dirty-cache.com/2021/07/06/reducing-oracle-tco-maximum-efficiency-architecture/