PowerFlex Native Asynchronous Replication RPO with Oracle
Mon, 17 Aug 2020 15:52:45 -0000
The PowerFlex software-defined storage platform provides a reliable, high-performance foundation for mission-critical applications like Oracle databases. In many of these deployments, replication and disaster recovery have become common practice for protecting critical data and ensuring application uptime. In this blog, I discuss strategies for replicating mission-critical Oracle databases using Dell EMC PowerFlex software-defined storage.
The Role of Replication and Disaster Recovery in Enterprise Applications
Customers require disaster recovery and replication capabilities to meet mission-critical business requirements where SLAs demand the highest uptime. They also need to recover quickly from physical or logical disasters so that applications can be brought back up in minimal time without impact to data. Replication means that the same data is available at multiple locations. For Oracle database environments, it is important to have local and remote replicas of application data that are suitable for testing, development, reporting, disaster recovery, and many other operations. Replication can improve performance and protects the availability of the Oracle database application because the data exists in another location. The advantage of keeping multiple copies of data across geographies is that critical business applications continue to function if the local Oracle database server fails.
Replication supports scenarios such as:
- Disaster recovery for applications, ensuring business continuity
- Distributing data for a specific use case, such as analytics
- Offloading workloads such as BI, analytics, data warehousing, ERP, and MRP from mission-critical systems
- Data Migration
- Disaster Recovery testing
PowerFlex Software-Defined Storage – Flexibility Unleashed
PowerFlex is a software-defined storage platform designed to significantly reduce operational and infrastructure complexity, empowering organizations to move faster by delivering flexibility, elasticity, and simplicity with predictable performance and resiliency at scale. The PowerFlex family provides a foundation that combines compute as well as high performance storage resources in a managed unified fabric.
PowerFlex is designed to provide extreme performance and massive scalability up to 1000s of nodes. It can be deployed as a disaggregated storage / compute (two-layer), HCI (single-layer), or a mixed architecture. PowerFlex inclusively supports applications ranging from bare-metal workloads and virtualized machines to cloud-native containerized apps. It is widely used for large-scale mission-critical applications like Oracle database. For information about best practices for deploying Oracle RAC on PowerFlex, see Oracle RAC on PowerFlex rack.
PowerFlex also offers several enterprise-class native capabilities to protect critical data at various levels:
- Storage Disk layer: The PowerFlex distributed data layout scheme is designed to maximize protection and optimize performance. A single volume is divided into chunks. These chunks are striped across physical disks throughout the cluster in a balanced and random manner. Each chunk has a total of two copies for redundancy.
- Fault Sets: Fault Sets are subgroups of SDSs installed on host servers within a Protection Domain. Implementing Fault Sets helps keep data available at all times. PowerFlex (previously VxFlex OS) mirrors the data of a Fault Set on SDSs that are outside that Fault Set. Thus, availability is assured even if all the servers within one Fault Set fail simultaneously.
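The two protection layers above can be illustrated with a small simulation. This is a minimal sketch, not PowerFlex's actual placement algorithm: the function names, the SDS names, and the rack-style Fault Set labels are all hypothetical. It only shows the idea that each chunk gets two copies and that the mirror copy lives outside the primary copy's Fault Set.

```python
import random


def place_chunks(num_chunks, sds_fault_sets, seed=0):
    """Pick a (primary, mirror) SDS pair for every chunk.

    sds_fault_sets maps an SDS name to its Fault Set name (hypothetical
    labels). The mirror is always drawn from a different Fault Set.
    """
    rng = random.Random(seed)
    sds_names = sorted(sds_fault_sets)
    placement = []
    for _ in range(num_chunks):
        primary = rng.choice(sds_names)
        # Only SDSs outside the primary's Fault Set may hold the mirror.
        candidates = [s for s in sds_names
                      if sds_fault_sets[s] != sds_fault_sets[primary]]
        mirror = rng.choice(candidates)
        placement.append((primary, mirror))
    return placement


def survives_fault_set_failure(placement, sds_fault_sets, failed_fault_set):
    """True if every chunk keeps at least one copy outside the failed set."""
    return all(
        sds_fault_sets[primary] != failed_fault_set
        or sds_fault_sets[mirror] != failed_fault_set
        for primary, mirror in placement
    )


# Hypothetical layout: six SDSs spread over three Fault Sets.
layout = {
    "sds1": "rackA", "sds2": "rackA",
    "sds3": "rackB", "sds4": "rackB",
    "sds5": "rackC", "sds6": "rackC",
}
placement = place_chunks(1000, layout)
```

Because the two copies of a chunk never share a Fault Set, losing every server in any single Fault Set still leaves one copy of each chunk reachable in this model.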
PowerFlex replication overview
PowerFlex software consists of several key components: Meta Data Manager (MDM), Storage Data Server (SDS), Storage Data Client (SDC), and Storage Data Replicator (SDR). The MDM manages the PowerFlex system as a whole, which includes metadata, device mapping, volumes, snapshots, system capacity, errors and failures, and system rebuild and rebalance tasks. The SDS is the software component that enables a node to contribute its local storage to the aggregated PowerFlex pool. The SDC is a lightweight device driver that exposes PowerFlex volumes as block devices to applications and hosts. The SDR handles replication activities. PowerFlex also has a unique feature called the Protection Domain. A Protection Domain is a logical entity that contains a group of SDSs. Each SDS belongs to only one Protection Domain.
Figure 1. PowerFlex asynchronous replication between two systems
Replication occurs between two PowerFlex systems designated as peer systems. These peer systems are connected over a LAN or WAN and are physically separated for protection purposes. Replication is defined within the scope of a protection domain. All objects that participate in replication are contained in the protection domain, including the volumes in a Replication Consistency Group (RCG). Journal capacity from storage pools in the protection domain is shared among the RCGs in that protection domain.
The SDR handles replication activities and manages the I/O of replicated logical volumes. The SDR is deployed on the same server as the SDS. Only I/O from replicated volumes flows through the SDR.
Replication Data Flow
Figure 2. PowerFlex replication I/O flow between two systems
- At the source, application I/O is passed from the SDS to the SDR.
- Application I/O is stored in the source journal space before it is sent to the target. The SDR packages I/O in bundles and sends them to the target journal space.
- Once the I/O has been sent and placed in the target journal space, it is cleared from the source journal.
- Once the I/O has been applied to the target volumes, it is cleared from the target journal.
- For replicated volumes, an SDS communicates with other SDSs through the SDR. For non-replicated volumes, an SDS communicates directly with other SDSs.
For a detailed architecture overview, see the Dell EMC PowerFlex: Introduction to Replication white paper.
It is important to note that this approach allows PowerFlex to support replication at extreme scale. As the number of nodes contributing storage grows, so does the number of SDR instances. As a result, this replication mechanism can scale from 4 to 1000s of nodes while delivering RPOs as low as 30 seconds and meeting I/O and throughput requirements.
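The journal hand-off described in the steps above can be sketched as a toy model. This is an illustration only, assuming a single volume and an in-memory journal; the class and method names are hypothetical and do not correspond to any PowerFlex API.

```python
from collections import deque


class JournalReplicator:
    """Toy model of journal-based asynchronous replication."""

    def __init__(self):
        self.source_journal = deque()
        self.target_journal = deque()
        self.target_volume = {}

    def write(self, block, data):
        """Record an application write in the source journal."""
        self.source_journal.append((block, data))

    def ship(self):
        """Send the pending bundle to the target journal.

        Entries are cleared from the source journal only after they
        have been placed in the target journal.
        """
        bundle = list(self.source_journal)
        self.target_journal.extend(bundle)
        self.source_journal.clear()
        return len(bundle)

    def apply(self):
        """Apply journaled writes to the target volume, then clear them."""
        applied = 0
        while self.target_journal:
            block, data = self.target_journal.popleft()
            self.target_volume[block] = data
            applied += 1
        return applied


replicator = JournalReplicator()
replicator.write(0, "redo-block-A")
replicator.write(1, "data-block-B")
shipped = replicator.ship()
applied = replicator.apply()
```

At any moment, the writes still sitting in the source journal are the ones that would be lost if the source site failed, which is why the journal drain rate determines the achievable RPO.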
Oracle Databases on PowerFlex
The following illustration demonstrates that the volumes participating in replication are grouped to form the Replication Consistency Group (RCG). RCG acts as the logical container for the volumes.
Figure 3. PowerFlex replication with Oracle database
Depending on the scenario, we can create a separate RCG for each volume pair or combine multiple volume pairs in a single RCG.
In the above Oracle setup, PowerFlex System-1 is the source and PowerFlex System-2 is the destination. For replication to occur between the source and target, the following criteria must be met:
- A volume pair must be created, with one volume on the source and one on the target.
- The source and target volumes must be the same size. However, the volumes can be in different storage pools.
- Volumes are in read-write access mode on the source and read-only access mode on the target. This maintains data integrity and consistency between the two peer systems.
PowerFlex replication is designed to support RPOs as low as 30 seconds, minimizing data loss if a disaster occurs. When creating an RCG, users can specify an RPO from 30 seconds up to a maximum of 60 minutes.
All operations performed on the source are replicated to the destination within the RPO. To ensure RPO compliance, PowerFlex replicates at least twice in every RPO period. For example, setting the RPO to 30 seconds means that PowerFlex can immediately return to operation at the target system with no more than 30 seconds of potential data loss.
The following figures depict the replication scenario under a steady-state workload:
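The criteria above lend themselves to a simple pre-flight check. The following is a minimal sketch under stated assumptions: the `Volume` record, its field names, and the access-mode strings are hypothetical and are not taken from the PowerFlex CLI or REST API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    name: str
    size_gb: int
    storage_pool: str
    access_mode: str  # hypothetical values: "read_write" or "read_only"


def validate_pair(source, target):
    """Return a list of problems; an empty list means the pair is valid."""
    problems = []
    if source.size_gb != target.size_gb:
        problems.append("source and target volumes differ in size")
    if source.access_mode != "read_write":
        problems.append("source volume must be read-write")
    if target.access_mode != "read_only":
        problems.append("target volume must be read-only")
    # Storage pools are deliberately not compared: they may differ.
    return problems


src = Volume("ora_data_01", 512, "pool_ssd", "read_write")
dst = Volume("ora_data_01_dr", 512, "pool_nvme", "read_only")
issues = validate_pair(src, dst)
```

Running a check like this over every volume pair before adding it to an RCG catches size mismatches early, which is the criterion most easily violated when volumes are created by hand on two systems.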
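The arithmetic behind this rule is straightforward: if replication happens at least twice per RPO period, the gap between replication cycles is at most half the RPO. A minimal sketch, where the constant and function names are hypothetical and the bounds are the 30-second and 60-minute limits stated above:

```python
MIN_RPO_SECONDS = 30        # lowest RPO stated in this blog
MAX_RPO_SECONDS = 60 * 60   # highest RPO stated in this blog (60 minutes)


def max_cycle_interval_seconds(rpo_seconds):
    """Longest gap between replication cycles that still yields
    at least two cycles per RPO period."""
    if not MIN_RPO_SECONDS <= rpo_seconds <= MAX_RPO_SECONDS:
        raise ValueError(
            f"RPO must be between {MIN_RPO_SECONDS} and "
            f"{MAX_RPO_SECONDS} seconds, got {rpo_seconds}"
        )
    return rpo_seconds / 2


interval_for_30s = max_cycle_interval_seconds(30)
interval_for_1h = max_cycle_interval_seconds(3600)
```

For a 30-second RPO, this gives a cycle interval of at most 15 seconds; for a 60-minute RPO, at most 30 minutes.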
Figure 4. 100% RPO compliance for RPO of 30s for an Oracle database during a steady application workload
Figure 5. Replication dashboard view of PowerFlex
In the case of disaster recovery, the entire application can be brought up by failing over to the secondary system, with less than 30 seconds of data loss.
When we perform a planned switchover or a failover, the volumes on the secondary system are automatically changed to read-write access mode and the volumes on the source are changed to read-only. Consequently, we can bring up the Oracle database on the secondary system by setting the Oracle environment variables and starting the database.
Once the RCG is in failover or switchover mode, the user can decide how to continue replication:
- Restore replication: Maintains the replication direction from the original source to the destination.
- Reverse replication: Changes the direction so that the original destination becomes the source, and replication runs from the original destination to the original source.
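The role changes described above can be modeled as a small state machine. This is a simplified sketch of the behavior as described in this blog, not of the product's internal state handling; the `RcgState` type, its fields, and the system names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RcgState:
    source: str    # system that replication flows from
    target: str    # system that replication flows to
    writable: str  # system whose volumes are in read-write mode


def switchover(state):
    """Target volumes become read-write; source volumes become read-only."""
    return replace(state, writable=state.target)


def restore_replication(state):
    """Keep the original direction; the original source is writable again."""
    return replace(state, writable=state.source)


def reverse_replication(state):
    """The original destination becomes the source of replication."""
    return RcgState(source=state.target, target=state.source,
                    writable=state.target)


initial = RcgState(source="system-1", target="system-2", writable="system-1")
after_switch = switchover(initial)
restored = restore_replication(after_switch)
reversed_state = reverse_replication(after_switch)
```

The model makes the difference between the two options explicit: restoring returns to the initial state, while reversing leaves the application running on the original destination and swaps the replication roles.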
PowerFlex also provides various other options:
- Pause and Resume RCG: Use this option if there are network issues or if the user needs to perform hardware maintenance. While the RCG is paused, application I/O is stored in the source journal and is replicated to the destination only after replication is resumed.
- Freeze and Unfreeze RCG: Use this option if the user requires a consistent snapshot of the source or target volumes. While the RCG is frozen, replication still occurs between the source journal and the destination journal, but the target journal holds the data and does not apply it to the target volumes.
PowerFlex native volume replication gives customers a disaster recovery solution that is easy to configure and set up.
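The difference between pausing and freezing comes down to where data accumulates. The sketch below is a toy model with hypothetical class and attribute names: pausing stops the transfer between journals, while freezing stops only the apply step at the target.

```python
from collections import deque


class ReplicationGroup:
    """Toy RCG with pause and freeze flags."""

    def __init__(self):
        self.paused = False
        self.frozen = False
        self.source_journal = deque()
        self.target_journal = deque()
        self.target_volume = []

    def write(self, io):
        self.source_journal.append(io)

    def cycle(self):
        """Run one replication cycle, honoring the pause/freeze flags."""
        if not self.paused:
            # Ship everything pending from source journal to target journal.
            self.target_journal.extend(self.source_journal)
            self.source_journal.clear()
        if not self.frozen:
            # Apply everything in the target journal to the target volume.
            self.target_volume.extend(self.target_journal)
            self.target_journal.clear()


rcg = ReplicationGroup()

rcg.paused = True
rcg.write("io-1")
rcg.cycle()
held_at_source = list(rcg.source_journal)   # data waits at the source

rcg.paused = False
rcg.frozen = True
rcg.cycle()
held_at_target = list(rcg.target_journal)   # shipped but not applied

rcg.frozen = False
rcg.cycle()
```

One practical consequence the model suggests: a long pause consumes source journal capacity, while a long freeze consumes target journal capacity.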
Irrespective of workload and application, it is designed to support massive scale while providing RPOs as low as 30 seconds.
For more information, please visit: DellTechnologies.com/PowerFlex.
Related Blog Posts
A Case for Repatriating High-value Workloads with PowerFlex Software-Defined Storage
Wed, 26 Aug 2020 18:33:51 -0000
Kent Stevens, Product Management, PowerFlex
Brian Dean, Senior Principal Engineer, TME, PowerFlex
Michael Richtberg, Chief Strategy Architect, PowerFlex
In this blog, we look at why customers are repatriating key applications from the cloud, help you think about where to run your key applications, and explain how PowerFlex’s unique architecture meets the demands of these workloads in running and transforming your business.
For critical software applications you depend upon to power core business and operational processes, moving to “The Cloud” might seem the easiest way to gain the agility to transform the surrounding business processes. Yet we see many of our customers making the move back home, back “On-Prem” for these performance-sensitive critical workloads – or resisting the urge to move to The Cloud in the first place. PowerFlex is proving to deliver agility and ease of operations for the IT infrastructure for high-value, large-scale workloads and data-center consolidation, along with a predictable cost profile – as a Cloud-like environment enabling you to reach your business objectives safely within your own data center or at co-lo facilities.
IDC recently found that 80% of their customers had repatriation activities, and 50% of public-cloud based applications were targeted to move to hosted-private cloud or on-premises locations within two years(1). IDC notes that the main drivers for repatriation are security, performance, cost, and control. Findings reported by 451 Research(2) show cost and performance as the top disadvantages when comparing on-premises storage to cloud storage services. We’ve further observed that core business-critical applications are a significant part of these migration activities.
You may have heard the term “data gravity,” which relates to the difficulty of moving data to and from the cloud, but that may be only part of the problem. “Application gravity” is likely a bigger problem for performance-sensitive workloads that struggle to achieve the required business results because of the scale and performance limitations of cloud storage services.
Transformation is the savior of your business – but a problem for your key business applications
Business transformation impacts the data-processing infrastructure in important ways. Applications that were stable and seldom touched are now the subject of massive changes on an ongoing basis. Revamped and intelligent business processes require new pieces of data, increasing the storage requirements, and those smarts (the newly automated or augmented decision-making) require constant tuning and adjustment. This is not what you want for the applications that power your most important business workflows and generate your profitability. You need maximum control and full purview over this environment to avoid unexpected disruptions. It’s a well-known dilemma that you must change the tires while the car is driving down the road, and today’s transformation projects can take this to the extreme.
The infrastructure used to host such high-profile applications – computing, storage and networking – must be operated at scale yet still be ready to grow and evolve. It must be resilient, remain available when hardware fails, and be able to transform without interruption to the business.
Does the public cloud deliver the results you expected?
Do your applications require certain minimum amounts of throughput? Are there latency thresholds you consider critical? Do you require large data capacities and the ability to scale as demands grow? Do you require certain levels of availability? You may assume all these requirements come with a “storage” product offered by the public cloud platforms, but most fall short of meeting these needs. Some require over-provisioning to get better performance. High availability options may be lacking. The highest-performing options have capacity scale limitations and can be prohibitively expensive. If you assume that a hyperscaler offers what you’ve been using on-prem, you may be quite surprised to find substantial gaps that require expensive application rearchitecting to be “cloud native,” which may become a budget buster. These public cloud attributes can lead to “application gravity” gaps.
While the agility is tempting, the unexpected cost of moving everything to the public cloud has turned back more than one company. When evaluating the economics and business justification for cloud solutions, many costs associated with full-scale operations, spikes in demand, or extended services can be hard to estimate and can turn out to be large and unpredictable.
The full price of cloud adoption must account for the required levels of resiliency, management infrastructure, storage and analytics for operational data, security solutions, and scaling up the resources to realistic production levels. Recognizing all the necessary services and scale may undermine what might have initially appeared to be a solid cost justification. Once the budget is established, active effort and attention must be devoted to monitoring and oversight. Adapting to unexpected operational events, such as bursting or autoscaling for temporary spikes in workload or traffic, can bring unforeseen leaps in the monthly bill. Such situations can be especially hard to predict and plan for – and very difficult to control.
You want the speed, convenience and elasticity of running in the cloud - but how do you ensure that agility while staying within the necessary bounds of cost and oversight? Truly transformative infrastructure allows businesses to consolidate compute and storage for disparate workloads onto a single unified infrastructure to simplify their environment, increase agility, improve resiliency and lower operational costs. And your potential payoff is big with far easier scaling, more efficient hardware utilization, and less time spent figuring out how to get things right or tracking down issues that complicate disparate system architectures.
Software-Defined is the Future
IDC predicts that by 2024, software-defined infrastructure solutions will account for 30% of storage solutions(3). At the heart of the PowerFlex family, and the enabler of its flexibility, scale, and performance, is PowerFlex software-defined storage. The ease and reliability of deployment and operation are provided by PowerFlex Manager, an IT operations and lifecycle management tool for full visibility and control over the PowerFlex infrastructure solutions.
PowerFlex’s unmatched combination of flexibility, elasticity, and simplicity with predictable high performance - at any scale - makes it ideally suited to be the common infrastructure for any company. Utilizing software defined storage (SDS) and hosting multiple heterogeneous computing environments, PowerFlex enables growth, consolidation, and change with cloud-like elasticity – without barriers that could impede your business.
The resulting unique architecture of the PowerFlex family easily meets the large-scale, always-on requirements of our customers’ core enterprise applications. The power and resiliency of the PowerFlex infrastructure platforms handle everything from high-performance enterprise databases, to web-scale transaction processing, to demanding business solutions in various industries including healthcare, utilities and energy. And this includes the new big-data and analytical workloads that are quickly augmenting the core applications as the business processes are being transformed.
PowerFlex: A Unique Platform for Operating and Transforming Critical Applications
PowerFlex provides the flexibility to utilize your choice of tools and solutions to drive your transformation and consolidation, while controlling the costs of the relentless expansion in data processing. PowerFlex provides the modularity to adapt and grow efficiently while providing the manageability to simplify your operations and reduce costs. It provides the scalable on-premises infrastructure that allows you to focus on your business operations. PowerFlex on-demand options, available by the end of 2020, enable an elastic OPEX consumption model as well.
As your business needs change, PowerFlex provides a non-disruptive path of adaptability. As you need more compute, storage or application workloads, PowerFlex modularly expands without complex data migration services. As your application infrastructure needs change from virtualization to containers and bare metal, PowerFlex can mix and match these in any combination necessary without needing physical changes or cluster segmentation. PowerFlex provides future-proof capabilities that keep up with your demands with six nines of availability and linear scalability.
With the dynamic new pace of growth and change, PowerFlex can ensure you stay in charge while enabling the agility to adapt efficiently. PowerFlex enables you to leverage the advantages of oversight and cost-effectiveness of the on-premises environment with the ability to meet transformation head-on.
1 IDC, Cloud Repatriation Accelerates in a Multi-Cloud World, July 2018
2 451 Research, 2020 Voice of the Enterprise
3 IDC, FutureScape: Worldwide Enterprise Infrastructure 2020 Predictions, October 2019
Oracle Database Solutions on Docker Container and Kubernetes
Tue, 25 Aug 2020 18:51:56 -0000
A container is a lightweight, stand-alone, executable package of software that includes everything that is needed to run an application: code, runtime, system tools, system libraries, and settings. A container isolates software from its environment and ensures that it works uniformly despite any differences between development and staging. Containers share the machine’s operating system kernel and do not require an operating system for each application, driving higher server efficiencies and reducing server and licensing costs.
The traditional build process for database application development is complex, time intensive and difficult to schedule. With containers and the right supporting tools, the traditional build process is transformed into a self-service, on-demand experience that enables developers to rapidly deploy applications. In the remaining sections of this article we describe how to develop the capability to have an Oracle database container running in a matter of minutes.
Oracle has a long-standing commitment to supporting the developer communities working in containerized environments. At the DockerCon US event in April 2017, Oracle announced that its Oracle 12c database software would be available alongside other Oracle products on Docker Store, the standard for DevOps developers. DevOps developers have pulled over four billion images from the Docker Store and are increasingly turning to it as the canonical source for high-quality curated content. Today, database customers are increasingly switching to containers with Kubernetes management to build and run a wide variety of applications and services in a highly available, on-premises hosted environment.
Containerized environments can reliably offer high-performance compute, storage and network capabilities with the necessary configurations. A containerized environment also reduces overhead costs by providing a repeatable process for application deployment across build, test, and production systems. To enable the deployment and management of containerized applications, organizations use Kubernetes technologies to operate at any scale including production. Kubernetes enables powerful collaboration and workflow management capabilities by deploying containers for cloud-native, distributed applications and microservices. It even allows you to repackage legacy applications for increased portability, more efficient deployment, and improved customer and employee engagement.
Figure 1: Docker containers for reducing development complexity
For many companies, to boost productivity and time to value, container usage starts with the departments that are focused on software development. Their journey typically starts with installing, implementing, and using containers for applications that are based on the microservice architecture as shown in Figure 2. Developers want to be able to build microservices-based container applications without changing code or infrastructure.
This approach enables portability between data centers and obviates the need for changes in traditional applications, enabling faster development and deployment cycles. Oracle Docker containers run the microservices, while Kubernetes is used for container orchestration. The microservices running within Docker containers can also communicate with the Oracle databases by using messaging services.
Figure 2: Architecture for Oracle Database featuring Docker and Kubernetes
Using orchestration and automation for containerized applications, developers can self-provision an Oracle database, thereby increasing flexibility and productivity while saving substantial time in creating a production copy for development and testing environments. This solution enables development teams to quickly provision isolated applications without the traditional complexities.
Our Dell EMC engineers recently tested and validated a solution for Oracle database using Docker containers and Kubernetes. The solution uses Oracle Database in containers, Kubernetes, and the Container Storage Interface (CSI) Driver for Dell EMC PowerFlex OS to show how dev/ops teams can transform their development processes.
Dell EMC engineers demonstrated two use cases for this solution. Both use cases feature four Dell EMC PowerEdge R640 servers, which are an integral part of Dell EMC VxFlex Ready Nodes, and the CSI Driver for Dell EMC PowerFlex, all hosted in our Dell EMC labs.
Use Case 1
In Use Case 1, the Dell EMC engineers manually provisioned the container-based development and testing environment shown in Figure 3 as follows:
- Install Docker.
- Activate the Docker Enterprise Edition license.
- Run the Oracle 12c database within the Docker container.
- Build and run the Oracle 19c database in the Docker container.
- Import the sample Oracle schemas that are pulled from GitHub into the Oracle 12c and 19c databases.
- Install Oracle SQL Developer and query tables from the container to demonstrate that the connection from Oracle SQL Developer to Oracle database functions.
Figure 3: Use Case 1 - Architecture
The key benefit of our first use case was the time that we saved by using Docker containers instead of the traditional manual installation and configuration method of building a typical Oracle database environment. Use Case 1 planning also demonstrated the importance of selecting the Docker registry location and storage provisioning options that are most appropriate for the requirements of a typical development and testing environment.
Use Case 2
Use Case 2 demonstrates the value of CSI plug-in integration with Kubernetes and Dell EMC PowerFlex storage to automate storage configuration. Kubernetes orchestration with PowerFlex provides a container deployment strategy with persistent storage. It demonstrates the ease, simplicity, and speed of scaling out a development and testing environment from production Oracle databases. In this use case, a developer provisions the Oracle database in containers on the same infrastructure described in Use Case 1, only this time using Kubernetes with the CSI Driver for Dell EMC PowerFlex. Figure 4 depicts the detailed architecture of Use Case 2.
Figure 4: Use Case 2 - Architecture
Use Case 2 demonstrates how Docker, Kubernetes, and the CSI Driver for Dell EMC PowerFlex accelerate the development life cycle for Oracle applications. Kubernetes configured with the CSI Driver for Dell EMC VxFlex OS simplified and automated the provisioning and removal of containers with persistent storage. Engineers used YAML configuration files along with the kubectl command to quickly deploy and delete containers and complete pods. Our solution demonstrates that developers can provision Oracle databases in containers without the complexities that are associated with installing the database and provisioning storage.
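To make the storage provisioning step concrete, here is a minimal sketch of the kind of manifest involved, generated from Python. The manifest structure is the standard Kubernetes PersistentVolumeClaim; the claim name, the size, and the storage class name `powerflex-sc` are hypothetical placeholders, since the actual names used in the tested solution are not given in this blog.

```python
import json


def oracle_pvc_manifest(name, size_gi, storage_class):
    """Build a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # The storage class is what routes the claim to the CSI driver.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


# Hypothetical names: adjust to the storage class defined in your cluster.
manifest = oracle_pvc_manifest("oracle-data", 64, "powerflex-sc")
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Because kubectl accepts JSON as well as YAML, the printed output can be saved to a file such as `pvc.json` and then created with `kubectl apply -f pvc.json` or removed with `kubectl delete -f pvc.json`.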
Use Case Observations and Benefits
Kubernetes container orchestration is an essential addition for database developers on a containerized development journey. Automation becomes essential as containerized application deployments expand. In this case, it enabled our developers to bypass the complexities that are associated with plain scripting. Instead, our solution uses open source Kubernetes to accomplish the developer’s objectives. The CSI plug-in integrates with Kubernetes and exposes the capabilities of the Dell EMC PowerFlex storage system, enabling the developer to:
- Take a snapshot of the Oracle database, including the sample schema that was pulled from the GitHub site.
- Protect the work done in the existing Oracle database, which was changed before taking the snapshot. Any state can be protected. The snapshot is created with the CSI Driver for Dell EMC PowerFlex OS, which is installed in Kubernetes to provide persistent storage.
- Restore an Oracle 19c database to its pre-deletion state using a snapshot, even after removing the containers and the attached storage.
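The snapshot and restore capabilities listed above map onto two standard Kubernetes objects: a VolumeSnapshot, and a new PersistentVolumeClaim that names the snapshot as its data source. The sketch below builds both as Python dicts. All object names, the snapshot class, and the storage class are hypothetical, and the snapshot API version shown (`snapshot.storage.k8s.io/v1`) depends on the cluster version; older clusters used beta versions of the same API.

```python
import json


def snapshot_manifest(name, pvc_name, snapshot_class):
    """VolumeSnapshot that captures the current state of a PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def restore_pvc_manifest(name, snapshot_name, size_gi, storage_class):
    """New PVC pre-populated from an existing VolumeSnapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
        },
    }


# Hypothetical names throughout.
snap = snapshot_manifest("oracle-data-snap", "oracle-data", "powerflex-snapclass")
restored = restore_pvc_manifest("oracle-data-restored", "oracle-data-snap",
                                64, "powerflex-sc")
print(json.dumps([snap, restored], indent=2))
```

Since the snapshot is an object separate from the original claim, a database container can be recreated against the restored claim even after the original containers and their storage have been removed, provided the snapshot itself was retained.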
In our second use case, using Kubernetes combined with the CSI Driver for Dell EMC PowerFlex OS simplified and automated the provisioning and removal of containers and storage. We used YAML files along with the kubectl command to deploy and delete the containers and pods. All these components facilitate the automation of the container hosting the Oracle database on top of PowerFlex.
Kubernetes, enhanced with the CSI Driver for Dell EMC VxFlex OS, provides the capability to attach Dell EMC VxFlex OS storage system volumes to containerized applications and to manage them. Our developers worked with a familiar Kubernetes interface to modify a copy of the Oracle database schema pulled from the GitHub repository and connect it to the Oracle database container. After modifying the database, the developer protected all progress by using the snapshot feature of the Dell EMC VxFlex OS storage system to create a point-in-time copy of the database.
Comparing Use Case 1 to Use Case 2 demonstrated how we can easily shift away from the complexities of scripting and using the command line to implement a self-service model that accelerates container management. The move to a self-service model, which increases developer productivity by removing bottlenecks, becomes increasingly important as the Docker container environment grows.
The power of containers and automation shows how tasks that traditionally required multiple roles (developers and others working with the storage and database administrators) can be simplified. Kubernetes with the CSI plug-in enables developers and others to do more in less time and with fewer complexities. The time savings mean that coding projects can be completed faster, benefiting the developers, the business-side employees, and customers. Overall, the key benefit shown in comparing our two use cases was the transformation from a manually managed container environment to an orchestrated system with more capabilities.
Innovation drives transformation. In the case of Docker containers and Kubernetes, the key benefit is a shift to rapid application deployment services. Oracle and many others have embraced containers and provide images of applications, such as for the Oracle 12c database, that can be deployed in days and instantiated in seconds. Installations and other repetitive tasks are replaced with packaged applications that enable the developer to work quickly in the database. The ease of using Docker and Kubernetes, combined with rapid provisioning of persistent storage, transforms development by removing wait time and enabling the developer to move closer to the speed of thought.
The addition of the Kubernetes orchestration system and the CSI Driver for Dell EMC VxFlex OS brings a rich user interface that simplifies provisioning containers and persistent storage. In our testing, we found that Kubernetes plus the CSI Driver for Dell EMC VxFlex OS enabled developers to provision containerized applications with persistent storage. This solution features point-and-click simplicity and frees valuable time so that the storage administrator can focus on business-critical tasks.