PowerFlex Native Asynchronous Replication RPO with Oracle
Mon, 17 Aug 2020 15:52:45 -0000
The PowerFlex software-defined storage platform provides a reliable, high-performance foundation for mission-critical applications like Oracle databases. In many of these deployments, replication and disaster recovery have become common practice for protecting critical data and ensuring application uptime. In this blog, I discuss strategies for replicating mission-critical Oracle databases using Dell EMC PowerFlex software-defined storage.
The Role of Replication and Disaster Recovery in Enterprise Applications
Customers require disaster recovery and replication capabilities to meet mission-critical business requirements where SLAs demand the highest uptime. Customers also want to recover quickly from physical or logical disasters, so that applications can be brought back up in minimal time without impact to data. Replication means that the same data is available at multiple locations. For Oracle database environments, it is important to have local and remote replicas of application data that are suitable for testing, development, reporting, disaster recovery, and many other operations. Replication protects the availability of an Oracle database application because the data exists in another location, and it can improve performance by letting work such as reporting run against a replica. The advantage of having multiple copies of data across geographies is that critical business applications continue to function if the local Oracle database server experiences a failure.
Replication enables customers in various scenarios such as:
- Disaster Recovery for applications ensuring business continuity
- Distributing data for a specific use case, such as analytics
- Offloading mission-critical workloads such as BI, Analytics, Data Warehousing, ERP, MRP, and so on
- Data Migration
- Disaster Recovery testing
PowerFlex Software-Defined Storage – Flexibility Unleashed
PowerFlex is a software-defined storage platform designed to significantly reduce operational and infrastructure complexity, empowering organizations to move faster by delivering flexibility, elasticity, and simplicity with predictable performance and resiliency at scale. The PowerFlex family provides a foundation that combines compute as well as high performance storage resources in a managed unified fabric.
PowerFlex is designed to provide extreme performance and massive scalability up to 1000s of nodes. It can be deployed as a disaggregated storage / compute (two-layer), HCI (single-layer), or a mixed architecture. PowerFlex inclusively supports applications ranging from bare-metal workloads and virtualized machines to cloud-native containerized apps. It is widely used for large-scale mission-critical applications like Oracle database. For information about best practices for deploying Oracle RAC on PowerFlex, see Oracle RAC on PowerFlex rack.
PowerFlex also offers several enterprise-class native capabilities to protect critical data at various levels:
- Storage Disk layer: PowerFlex storage distributed data layout scheme is designed to maximize protection and optimize performance. A single volume is divided into chunks. These chunks will be striped on physical disks throughout the cluster, in a balanced and random manner. Each chunk has a total of two copies for redundancy.
- Fault Sets: Implementing Fault Sets helps keep data available at all times. A Fault Set is a subgroup of SDSs installed on host servers within a Protection Domain. PowerFlex (previously VxFlex OS) mirrors the data of a Fault Set on SDSs that are outside that Fault Set. Thus, availability is assured even if all the servers within one Fault Set fail simultaneously.
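The two protection layers above can be pictured with a toy placement sketch. It assumes hypothetical names (`place_chunks`, the SDS-to-Fault-Set map) and a plain random choice; it is not PowerFlex's actual placement algorithm. It only illustrates that each chunk has two copies and that the second copy lands outside the Fault Set of the first.

```python
import random

def place_chunks(num_chunks, sds_to_fault_set, seed=0):
    """Return {chunk: (primary_sds, mirror_sds)}; the mirror is outside the primary's Fault Set."""
    rng = random.Random(seed)
    sds_names = sorted(sds_to_fault_set)
    layout = {}
    for chunk in range(num_chunks):
        primary = rng.choice(sds_names)
        # Only SDSs in a different Fault Set may hold the second copy.
        candidates = [s for s in sds_names
                      if sds_to_fault_set[s] != sds_to_fault_set[primary]]
        if not candidates:
            raise ValueError("need at least two Fault Sets")
        layout[chunk] = (primary, rng.choice(candidates))
    return layout

def survives_fault_set_loss(layout, sds_to_fault_set, failed_fault_set):
    """True if every chunk still has a copy outside the failed Fault Set."""
    return all(
        any(sds_to_fault_set[s] != failed_fault_set for s in copies)
        for copies in layout.values()
    )

# Hypothetical six-SDS Protection Domain with three Fault Sets.
sds_to_fault_set = {
    "sds1": "fs-A", "sds2": "fs-A",
    "sds3": "fs-B", "sds4": "fs-B",
    "sds5": "fs-C", "sds6": "fs-C",
}
layout = place_chunks(1000, sds_to_fault_set)
```

With this layout, losing every server in any single Fault Set still leaves one copy of each chunk readable.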
PowerFlex replication overview
PowerFlex software consists of four main components: the Meta Data Manager (MDM), Storage Data Server (SDS), Storage Data Client (SDC), and Storage Data Replicator (SDR). MDM manages the PowerFlex system as a whole, which includes metadata, device mapping, volumes, snapshots, system capacity, errors and failures, and system rebuild and rebalance tasks. SDS is the software component that enables a node to contribute its local storage to the aggregated PowerFlex pool. SDC is a lightweight device driver that exposes PowerFlex volumes as block devices to the applications and hosts. SDR handles the replication activities. PowerFlex also has a unique feature called the Protection Domain. A Protection Domain is a logical entity that contains a group of SDSs. Each SDS belongs to only one Protection Domain.
Figure 1. PowerFlex asynchronous replication between two systems
Replication occurs between two PowerFlex systems designated as peer systems. These peer systems are connected using LAN or WAN and are physically separated for protection purposes. Replication is defined in the scope of a protection domain. All objects that participate in replication are contained in the protection domain, including the volumes in a Replication Consistency Group (RCG). Journal capacity from storage pools in the protection domain is shared among the RCGs in that protection domain.
The SDR handles replication activities and manages I/O of replicated logical volumes. The SDR is deployed on the same server as the SDS. Only I/Os from replicated volumes flow through the SDR.
Replication Data Flow
Figure 2. PowerFlex replication I/O flow between two systems
- At the source, application I/Os are passed from SDS to SDR.
- Application I/Os are stored in the source journal space before they are sent to the target. SDR packages I/Os in bundles and sends them to the target journal space.
- Once the I/Os have been sent and placed in the target journal space, they are cleared from the source journal.
- Once the I/Os are applied to the target volumes, they are cleared from the destination journal.
- For replicated volumes, SDS communicates with other SDSs via SDR. For non-replicated volumes, SDS communicates directly with other SDSs.
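The journal hand-off in the steps above can be sketched as a small in-memory model. The class and method names (`ReplicationPair`, `ship`, `apply`) are illustrative assumptions rather than PowerFlex interfaces, and the model ignores networking, bundling policy, and failure handling.

```python
from collections import deque

class ReplicationPair:
    """Toy model: source journal -> target journal -> target volume."""

    def __init__(self):
        self.source_journal = deque()
        self.target_journal = deque()
        self.target_volume = {}

    def write(self, block, data):
        # An application write is recorded in the source journal first.
        self.source_journal.append((block, data))

    def ship(self):
        # Send the journaled writes as one bundle; clear them from the
        # source only once they sit in the target journal.
        bundle = list(self.source_journal)
        self.target_journal.extend(bundle)
        self.source_journal.clear()
        return len(bundle)

    def apply(self):
        # Apply writes to the target volume in order, clearing each one
        # from the target journal as it is applied.
        applied = 0
        while self.target_journal:
            block, data = self.target_journal.popleft()
            self.target_volume[block] = data
            applied += 1
        return applied

pair = ReplicationPair()
pair.write(0, "a")
pair.write(1, "b")
pair.write(0, "c")   # a later write to the same block
shipped = pair.ship()
applied = pair.apply()
```

Because writes are applied in journal order, the target ends up with the latest value for block 0, and both journals are empty afterward.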
For detailed information about Architecture Overview, see Dell EMC PowerFlex: Introduction to Replication White Paper.
It is important to note that this approach to replication allows PowerFlex to support replication at extreme scales. As the number of nodes contributing storage is scaled, so is the number of SDR instances. As a result, this replication mechanism can scale from 4 to 1000s of nodes while delivering RPOs as low as 30 seconds and meeting I/O and throughput requirements.
Oracle Databases on PowerFlex
The following illustration demonstrates that the volumes participating in replication are grouped to form the Replication Consistency Group (RCG). RCG acts as the logical container for the volumes.
Figure 3. PowerFlex replication with Oracle database
Depending on the scenario, we can create a separate RCG for each volume pair or combine multiple volume pairs in a single RCG.
In the above Oracle setup, PowerFlex System-1 is the source and PowerFlex System-2 is the destination. For replication to occur between the source and target, the following criteria must be met:
- A volume pair must be created, with one volume on the source and one on the target.
- The volumes on the source and target must be the same size. However, they can be in different storage pools.
- Volumes are in read-write access mode on the source and read-only access mode on the target. This maintains data integrity and consistency between the two peer systems.
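The pairing criteria above can be expressed as a short check. The `Volume` record, the `can_pair` helper, and the volume and pool names are hypothetical and only illustrate the rules; they are not part of any PowerFlex tool.

```python
from dataclasses import dataclass

@dataclass
class Volume:
    name: str
    size_gb: int
    storage_pool: str

def can_pair(source: Volume, target: Volume) -> bool:
    """Sizes must match; the storage pools are allowed to differ."""
    return source.size_gb == target.size_gb

def access_modes_while_replicating() -> dict:
    """Access modes described in the text for an active volume pair."""
    return {"source": "read-write", "target": "read-only"}

# Hypothetical Oracle data volume and its replica in a different pool.
src = Volume("ora_data", 512, "pool-1")
dst = Volume("ora_data_dr", 512, "pool-2")
too_small = Volume("ora_data_small", 256, "pool-2")

pair_ok = can_pair(src, dst)
pair_bad = can_pair(src, too_small)
```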
PowerFlex replication is designed to support RPOs as low as 30 seconds, minimizing data loss if a disaster occurs. When creating an RCG, users can specify an RPO from 30 seconds up to a maximum of 60 minutes.
All the operations performed on source will be replicated to destination within the RPO. To ensure RPO compliance, PowerFlex replicates at least twice for every RPO period. For example, setting RPO to 30 seconds means that PowerFlex can immediately return to operation at the target system with only 30 seconds of potential data loss.
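As a rough worked example of the arithmetic, replicating at least twice per RPO period implies that the gap between replication cycles is at most half the RPO. The function below is a sketch under that reading; its name and the validation are assumptions for illustration, not a PowerFlex interface.

```python
MIN_RPO_SECONDS = 30          # lowest RPO mentioned in the text
MAX_RPO_SECONDS = 60 * 60     # highest RPO mentioned in the text (60 minutes)

def max_cycle_interval(rpo_seconds):
    """Longest gap between replication cycles if data is sent at least twice per RPO."""
    if not MIN_RPO_SECONDS <= rpo_seconds <= MAX_RPO_SECONDS:
        raise ValueError("RPO must be between 30 seconds and 60 minutes")
    return rpo_seconds / 2

interval_for_30s = max_cycle_interval(30)      # 15.0 seconds
interval_for_1h = max_cycle_interval(3600)     # 1800.0 seconds
```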
The following figures depict the replication scenario under a steady-state workload:
Figure 4. 100% RPO compliance for RPO of 30s for an Oracle database during a steady application workload
Figure 5. Replication dashboard view of PowerFlex
In the case of disaster recovery, the entire application can be up and running by failing over to the secondary system, with less than 30 seconds of data loss.
When we perform a planned switchover or a failover, the volumes on the secondary system are automatically changed to read-write access mode and the volumes on the source are changed to read-only. Consequently, we can bring up the Oracle database on the secondary system by setting the Oracle environment variables and starting the database.
Once the RCG is in failover or switchover mode, the user can decide how to continue with replication:
- Restore replication: Maintains the replication direction from original source to destination.
- Reverse replication: Changes the direction so that original destination becomes the source and replication will begin from original destination to original source.
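The role changes described above can be summarized as a small state sketch. The class and field names are hypothetical, and one detail is an assumption not stated in the text: that restoring replication returns the access modes to their pre-failover values.

```python
from dataclasses import dataclass

@dataclass
class RcgRoles:
    source_access: str = "read-write"
    target_access: str = "read-only"
    direction: str = "source->target"
    replicating: bool = True

    def failover(self):
        # Failover or switchover: target becomes writable, source read-only.
        self.source_access, self.target_access = "read-only", "read-write"
        self.replicating = False

    def restore(self):
        # Restore replication: keep the original source->target direction.
        # Assumption: access modes go back to their pre-failover values.
        self.source_access, self.target_access = "read-write", "read-only"
        self.direction = "source->target"
        self.replicating = True

    def reverse(self):
        # Reverse replication: the original target now acts as the source.
        self.source_access, self.target_access = "read-only", "read-write"
        self.direction = "target->source"
        self.replicating = True

rcg = RcgRoles()
rcg.failover()
after_failover = (rcg.source_access, rcg.target_access, rcg.replicating)
rcg.reverse()
after_reverse = (rcg.direction, rcg.replicating)
```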
PowerFlex also provides various other options:
- Pause and Resume RCG: Used if there are network issues or the user needs to perform maintenance on any of the hardware. While the RCG is paused, application I/O is stored in the source journal and is replicated to the destination only after replication is resumed.
- Freeze and Unfreeze RCG: Used if the user requires a consistent snapshot of the source or target volumes. While the RCG is frozen, replication still occurs between the source journal and the destination journal, but the target journal holds on to the data and does not apply it to the target volumes.
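The difference between pausing and freezing can be seen in a toy model: pause stops the transfer between journals, while freeze stops only the apply step at the target. The class, flags, and `cycle` method are illustrative assumptions, not PowerFlex interfaces.

```python
class RcgJournals:
    """Toy model of pause (no shipping) versus freeze (no applying)."""

    def __init__(self):
        self.source_journal = []
        self.target_journal = []
        self.target_volume = []
        self.paused = False
        self.frozen = False

    def write(self, data):
        # Application writes always land in the source journal.
        self.source_journal.append(data)

    def cycle(self):
        # Ship journal contents unless paused.
        if not self.paused:
            self.target_journal.extend(self.source_journal)
            self.source_journal.clear()
        # Apply to the target volume unless frozen.
        if not self.frozen:
            self.target_volume.extend(self.target_journal)
            self.target_journal.clear()

rcg = RcgJournals()

rcg.paused = True
rcg.write("w1")
rcg.cycle()
held_at_source = list(rcg.source_journal)      # write waits in the source journal

rcg.paused = False
rcg.frozen = True
rcg.cycle()
held_at_target = list(rcg.target_journal)      # shipped, but not yet applied
volume_while_frozen = list(rcg.target_volume)

rcg.frozen = False
rcg.cycle()
volume_after_unfreeze = list(rcg.target_volume)
```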
PowerFlex native volume replication is a unique solution that is easy to configure and set up, so customers can worry less about disaster.
Irrespective of workload and application, it is designed to support massive scale while providing RPOs as low as 30 seconds.
For more information, please visit: DellTechnologies.com/PowerFlex.
Related Blog Posts
What’s New with Dell Unity OE Version 5.2?
Tue, 10 May 2022 21:06:27 -0000
Did you know that Dell Unity had a new software release on April 29, 2022? No? With all the hype of Dell Technologies World 2022, which included a number of new products, software, and company partnership announcements, Dell Unity may have flown under the radar with one of the most important software updates in its history. But fear not, I have you covered. This software release delivers several important storage technologies designed to simplify how users address capacity expansion and recover data faster, expand disaster recovery topologies, increase storage utilization, and cost-effectively upgrade Dell Unity XT systems while lowering capital and operating expenses. The following sections highlight the updates in this release.
Unity XT data-in-place conversions
So, you purchased a Unity XT 480/F or 680/F system, but now it’s under-sized and handling more than you planned for. With storage architects and administrators trying to do more with less, it is not uncommon that new requirements come up and a storage system is pushed to its limits. Whether you are maxing out performance and capacity, or are looking for higher limits, a data-in-place (DIP) conversion may be just what is needed.
In OE version 5.2, Hybrid (HFA) or All Flash (AFA) Unity XT 480 and 680 systems can be upgraded to a higher Dell Unity model of the same type. This means that an HFA system can be upgraded to a higher model HFA system, and an AFA system can be upgraded to a higher model AFA system. The Unity XT 480 can be upgraded directly to a Unity XT 680 or 880 model system. The Unity XT 680 can be upgraded to the Unity XT 880 model system. This upgrade not only increases the performance potential of the Storage Processors, but the higher models also include higher system limits. Data-in-place upgrades reuse the same I/O modules, SFPs, and power supplies from the replaced Storage Processors, and can be completed either online or offline.
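The upgrade rules in the paragraph above fit in a small lookup. The table and the `can_convert` helper are illustrative only and are not part of Unisphere, the CLI, or the REST API.

```python
# Allowed data-in-place targets per source model, as listed in the text.
UPGRADE_TARGETS = {
    "480": {"680", "880"},
    "680": {"880"},
}

def can_convert(src_model, src_type, dst_model, dst_type):
    """True if the conversion stays within the same type (HFA or AFA) and moves to a listed higher model."""
    return src_type == dst_type and dst_model in UPGRADE_TARGETS.get(src_model, set())

ok_480_to_880 = can_convert("480", "AFA", "880", "AFA")
cross_type = can_convert("480", "HFA", "680", "AFA")
downgrade = can_convert("680", "HFA", "480", "HFA")
```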
During an online upgrade, one Storage Processor (SP) is replaced at a time. All hosts/servers that are configured with high availability can remain online and run I/O to the system. This procedure closely mimics a software non-disruptive upgrade where one SP is upgraded at a time. An offline DIP upgrade needs to occur during complete application downtime. Although the system is offline, this upgrade process does complete faster because both storage processors are replaced at the same time. After completing the upgrade, the model will be seen in Unisphere, Unisphere CLI, and REST API as the new model, as if it came from the factory.
I/O module conversions
Performance is always impacted by the slowest component in an environment, almost like the weakest link in a chain. For some, infrastructure within an environment may be upgraded, leaving the storage system limited due to its current configuration. Do you have your Unity XT 16Gb Fibre Channel front end ports connected to a 32Gb switch? If your answer is Yes, then why not upgrade the storage too? OE version 5.2 supports upgrading 16Gb Fibre Channel modules in existing Unity XT systems to the 32Gb Fibre Channel modules.
What used to be a non-data-in-place conversion can now be achieved through a new service script without impacting the data on the system. During the upgrade procedure, the Fibre Channel I/O modules are upgraded in an NDU manner. One Fibre Channel module is replaced in one storage processor, while any I/O to the system is serviced by the peer SP. After the upgrade is complete and the system is utilizing the new 32 Gb I/O modules and SFPs, the front end ports on the Unity XT system are no longer the weakest link.
For more information about Unity XT data-in-place upgrades or I/O module conversions, see the Dell Unity XT: Introduction to the Platform white paper.
Hybrid flash system enhancements
Dell Unity hybrid systems continue to be a compelling storage solution for small to mid-size enterprises, supporting general purpose workloads that don’t need the speed and low latency of All Flash or NVMe architectures. The following are software enhancements for hybrid systems in OE version 5.2.
Dynamic pools were first introduced in OE version 4.2 for All Flash systems. This pool type drops the traditional pool RAID Group based configuration in favor of advanced RAID techniques and distributed sparing. Dynamic pools allow for better storage utilization than the previous pool type, and more simplified planning. Users no longer need to add multiples of drive sets to achieve a particular capacity. In most cases, a dynamic pool can be expanded with a single drive.
Just like my youngest child watching his older brother do things he can’t, hybrid systems have been left wondering when it will be their turn. With OE version 5.2, the time has come. In OE version 5.2, dynamic pools can be created on any hybrid flash system, not just Unity XT if that is what you are thinking. A dynamic pool can also be a single drive type or include multiple drive types just like their traditional pool counterparts. For each drive type within a dynamic pool, a minimum drive count is required, but expanding an existing tier with a single drive is possible in most situations. Dynamic pools also support FAST VP and FAST Cache on hybrid systems.
For more details see the Dell Unity: Dynamic Pools white paper.
Do you have a hybrid Unity XT system and could use some extra capacity? What if I told you, it’s free? While I can’t send you free drives through a blog, I can let you know that data reduction and advanced deduplication are now supported on hybrid pools within hybrid model Unity XT systems.
To support Data Reduction, the pool must contain a flash tier whose total usable capacity meets or exceeds 10% of the total pool capacity. Once the system is running OE version 5.2, you can enable data reduction with and without advanced deduplication on existing resources or new resources. The pool type can also either be traditional or dynamic.
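As a quick illustration of the flash-tier requirement, the check below compares flash usable capacity with 10% of total pool capacity. The function name and the capacity figures are hypothetical.

```python
def supports_data_reduction(flash_usable_tb, total_pool_tb):
    """True if the flash tier is at least 10% of the total pool capacity."""
    if total_pool_tb <= 0:
        raise ValueError("total pool capacity must be positive")
    # Multiply instead of dividing to avoid floating-point rounding at the boundary.
    return flash_usable_tb * 10 >= total_pool_tb

meets_minimum = supports_data_reduction(10, 100)   # exactly 10% of the pool
below_minimum = supports_data_reduction(9, 100)    # 9% of the pool
```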
For more details see the Dell Unity: Data Reduction white paper.
Are you feeling limited by the replication topologies that are currently supported? The OE version 5.2 release may just have what you need to support the file replication topology of your dreams, or maybe just what your data protection requirements dictate. In this release, each file resource supports a maximum of four replication sessions, which includes the inbound replication sessions and the outbound replication sessions. What’s different in this release is that one of the four replication sessions can be synchronous, and you can also create a replication session outbound from a synchronous replication destination. The picture below shows a replication topology that is now possible using the OE version 5.2 release. Note: All systems within the topology must be running OE version 5.2 or later to support these new topologies.
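The session limits can be sketched as a simple validation. This reads "one of the four replication sessions can be synchronous" as "at most one", which is an assumption, and the helper name is hypothetical.

```python
MAX_SESSIONS_PER_RESOURCE = 4
MAX_SYNC_SESSIONS = 1   # assumption: "one of the four" means at most one

def valid_session_set(session_modes):
    """session_modes lists 'sync' or 'async' for every inbound and outbound session of one file resource."""
    return (len(session_modes) <= MAX_SESSIONS_PER_RESOURCE
            and session_modes.count("sync") <= MAX_SYNC_SESSIONS)

one_sync_three_async = valid_session_set(["sync", "async", "async", "async"])
two_sync = valid_session_set(["sync", "sync"])
five_async = valid_session_set(["async"] * 5)
```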
For more details see the Dell Unity: Replication Technologies white paper.
Have you ever had an IP address change in your environment, which required you to update each component within the environment that used it? I have, and even though I only use a test lab, it still caused me heartburn. Depending on the environment, the changes required can be a long and tedious process.
In OE version 5.2, an update to LDAP can save you some time in the future. Now when configuring LDAP server addresses manually on a NAS Server, users can either enter the LDAP server IP or Fully Qualified Domain Name (FQDN). By entering the FQDN, an IP change on the LDAP server no longer requires changing the LDAP configuration on each NAS server. The NAS server automatically picks up the new IP using DNS. Taking the time to update an existing configuration can save a bunch of time in the future, especially when the change needs to occur late at night or on weekends.
For more details see the Dell Unity: NAS Capabilities white paper.
I’ve outlined just a few of the major features in the Dell Unity OE version 5.2 release. For more information about other features in this release, check out these resources:
Author: Ryan Poulin
How PowerFlex Transforms Big Data with VMware Tanzu Greenplum
Wed, 13 Apr 2022 13:16:23 -0000
Quick! The word has just come down. There is a new initiative that requires a massively parallel processing (MPP) database, and you are in charge of implementing it. What are you going to do? Luckily, you know the answer. You also just discovered that the Dell PowerFlex Solutions team has you covered with a solutions guide for VMware Tanzu Greenplum.
What is in the solutions guide and how will it help with an MPP database? This blog provides the answer. We look at what Greenplum is and how to leverage Dell PowerFlex for both the storage and compute resources in Greenplum.
Infrastructure flexibility: PowerFlex
If you have read my other blogs or are familiar with PowerFlex, you know it has powerful transmorphic properties. For example, PowerFlex nodes sometimes function as both storage and compute, like hyperconverged infrastructure (HCI). At other times, PowerFlex functions as a storage-only (SO) node or a compute-only (CO) node. Even more interesting, these node types can be mixed and matched in the same environment to meet the needs of the organization and the workloads that they run.
This transmorphic property of PowerFlex is helpful in a Greenplum deployment, especially with the configuration described in the solutions guide. Because the deployment is built on open-source PostgreSQL, it is optimized for the needs of an MPP database, like Greenplum. PowerFlex can deliver the compute performance necessary to support massive data IO with its CO nodes. The PowerFlex infrastructure can also support workloads running on CO nodes or nodes that combine compute and storage (hybrid nodes). By leveraging the malleable nature of PowerFlex, no additional silos are needed in the data center, and it may even help remove existing ones.
The architecture used in the solutions guide consists of 12 CO nodes and 10 SO nodes. The CO nodes have VMware ESXi installed on them, with Greenplum instances deployed on top. There are 10 segments and one director deployed for the Greenplum environment. The 12th CO node is used for redundancy.
The storage tier uses the 10 SO nodes to deliver 12 volumes backed by SSDs. This configuration creates a high speed, highly redundant storage system that is needed for Greenplum. Also, two protection domains are used to provide both primary and mirror storage for the Greenplum instances. Greenplum mirrors the volumes between those protection domains, adding an additional level of protection to the environment, as shown in the following figure:
By using this fluid and composable architecture, the components can be scaled independently of one another, allowing for storage to be increased either independently or together with compute. Administrators can use this configuration to optimize usage and deliver appropriate resources as needed without creating silos in the environment.
Testing and validation with Greenplum: we have you covered
The solutions guide not only describes how to build a Greenplum environment, it also addresses testing, which many administrators want to perform before they finish a build. The guide covers performing basic validations with FIO and gpcheckperf. In the simplest terms, these tools ensure that IO, memory, and network performance are acceptable. The FIO tests that were run for the guide showed that the HBA was fully saturated, maximizing both read and write operations. The gpcheckperf testing showed a performance of 14,283.62 MB/sec for write workloads.
Wouldn’t you feel better if a Greenplum environment was tested with a real-world dataset? That is, taking it beyond just the minimum, maximum, and average numbers? The great news is that the architecture was tested that way! Our Dell Digital team has developed an internal test suite running static benchmarked data. This test suite is used at Dell Technologies across new Greenplum environments as the gold standard for new deployments.
In this test design, all the datasets and queries are static. This scenario allows for a consistent measurement of the environment from one run to the next. It also provides a baseline of an environment that can be used over time to see how its performance has changed -- for example, if the environment sped up or slowed down following a software update.
Massive performance with real data
So how did the architecture fare? It did very well! When 182 parallel complex queries were run simultaneously to stress the system, it took just under 12 minutes for the test to run. In that time, the environment had a read bandwidth of 40 GB/s and a write bandwidth of 10 GB/s. These results are using actual production-based queries from the Dell Digital team workload. These results are close to saturating the network bandwidth for the environment, which indicates that there are no storage bottlenecks.
The design covered in this solution guide goes beyond simply verifying that the environment can handle the workload; it also shows how the configuration can maintain performance during ongoing operations.
Maintaining performance with snapshots
One of the key areas that we tested was the impact of snapshots on performance. Snapshots are a frequent operation in data centers and are used to create test copies of data as well as a source for backups. For this reason, consider the impact of snapshots on MPP databases when looking at an environment, not just how fast the database performs when it is first deployed.
In our testing, we used the native snapshot capabilities of PowerFlex to measure the impact that snapshots have on performance. Using PowerFlex snapshots provides significant flexibility in data protection and cloning operations that are commonly performed in data centers.
We found that when the first storage-consistent snapshot of the database volumes was taken, the test took 45 seconds longer to complete than initial tests. This result was because it was the first snapshot of the volumes. Follow-on snapshots during testing resulted in minimal impact to the environment. This minimal impact is significant for MPP databases in which performance is important. (Of course, performance can vary with each deployment.)
We hope that these findings help administrators who are building a Greenplum environment feel more at ease. You not only have a solution guide to refer to as you architect the environment, you can be confident that it was built on best-in-class infrastructure and validated using common testing tools and real-world queries.
The bottom line
Now that you know the assignment is coming to build an MPP database using VMware Tanzu Greenplum -- are you up to the challenge?
If you are, be sure to read the solution guide. If you need additional guidance on building your Greenplum environment on PowerFlex, be sure to reach out to your Dell representative.