Using Dell PowerFlex and Google Distributed Cloud Virtual for Postgres Databases and How to Protect Them
Fri, 03 Nov 2023 23:27:04 -0000
Did you know you can get the Google Cloud experience in your data center? Well now, you can! Using Google Distributed Cloud (GDC) Virtual and Dell PowerFlex enables the use of cloud and container workloads – such as Postgres databases – in your data center.
Looking beyond day one operations, the whole lifecycle must be considered, which includes assessing how to protect these cloud native workloads. That’s where Dell PowerProtect Data Manager comes in, allowing you to protect your workloads both in the data center and the cloud. PowerProtect Data Manager enhances data protection by discovering, managing, and sending data directly to the Dell PowerProtect DD series virtual appliance, resulting in unmatched efficiency, deduplication, performance, and scalability. Together with PowerProtect Data Manager, the PowerProtect DD is the ultimate cyber resilient data protection appliance.
In the following blog, we will unpack all this and more, giving you the opportunity to see how Dell PowerFlex and GDC Virtual can transform how you cloud.
What is Google Distributed Cloud Virtual?
We will start by looking at GDC Virtual and how it allows you to consume the cloud on your terms.
GDC Virtual provides you with a consistent platform for building and managing containerized applications across hybrid infrastructures and helps your developers become more productive across all environments. GDC Virtual provides all the mechanisms required to bring your code into production reliably, securely, and consistently while minimizing risk. GDC Virtual is built on open-source technologies pioneered by Google Cloud including Kubernetes and Istio, enabling consistency between cloud and on premises environments like PowerFlex. Anthos GKE (on GCP and on-prem), Anthos Service Mesh, and Anthos Config Management are the core building blocks of Anthos, which has integrations with platform-level services such as Stackdriver, Cloud Build, and Binary Authorization. GDC Virtual users purchase services and resources from the GCP Marketplace.
Figure 1. GDC Virtual components.
GDC Virtual puts all your IT resources into a consistent development, management, and control framework, automating low-value tasks across your PowerFlex and GCP infrastructure.
Within the context of GCP, the term ‘hybrid cloud’ describes a setup in which common or interconnected services are deployed across multiple computing environments, including public cloud and on-premises. A hybrid cloud strategy allows you to extend the capacity and capabilities of your IT by adding one or more cloud deployments to your existing infrastructure, without upfront capital expense and while preserving your existing investments. For more information, see Hybrid and Multi-Cloud Architecture Patterns.
PowerFlex delivers software-defined storage to both virtual environments and bare metal hosts, providing flexible consumption of resources. This enables both two-tier and three-tier architectures to match the needs of almost any environment.
PowerFlex container storage
From the PowerFlex UI – shown in the following figure – you can easily monitor the performance and usage of your PowerFlex environment. Additionally, PowerFlex offers a container storage interface (CSI) and container storage modules (CSM) for integration with your container environment. The CSI/CSM allows containers to have persistent storage, which is important when working with workloads like databases that require it.
Figure 2. PowerFlex dashboard provides easy access to information.
To gain a deeper understanding of implementing GDC Virtual on Dell PowerFlex, we invite you to explore our recently published reference architecture.
Dell engineers have recently prepared a PostgreSQL container environment deployed from the Google Cloud to a PowerFlex environment with GDC Virtual in anticipation of KubeCon. For those who have deployed Postgres from Google Cloud, you know it doesn’t take long to deploy. It took our team about 10 minutes, which makes it effortless to consume and integrate into workloads.
Once we had Postgres deployed, we proceeded to put it under load as we added records to it. To do this, we used pgbench, which is a built-in benchmarking tool in Postgres. This made it easy to fill a database with 10 million entries. We then used pgbench to simulate the load of 40 clients running 40 threads against the freshly loaded database.
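The pgbench steps described above can be sketched as a pair of command lines. This is a minimal sketch rather than the exact commands used in the lab: the database name `benchdb` and the 300-second run time are assumptions for illustration. It relies on the documented pgbench behavior that each scale unit creates 100,000 rows in `pgbench_accounts`, so 10 million entries corresponds to a scale factor of 100.

```python
import math
import shlex
from typing import List

# pgbench creates 100,000 pgbench_accounts rows per scale unit.
ROWS_PER_SCALE_UNIT = 100_000


def scale_for_rows(target_rows: int) -> int:
    """Smallest pgbench scale factor that yields at least target_rows accounts."""
    return max(1, math.ceil(target_rows / ROWS_PER_SCALE_UNIT))


def init_command(dbname: str, target_rows: int) -> List[str]:
    # -i initializes the pgbench tables, -s sets the scale factor.
    return ["pgbench", "-i", "-s", str(scale_for_rows(target_rows)), dbname]


def load_command(dbname: str, clients: int, threads: int, seconds: int) -> List[str]:
    # -c = concurrent clients, -j = worker threads, -T = run time in seconds.
    return ["pgbench", "-c", str(clients), "-j", str(threads),
            "-T", str(seconds), dbname]


if __name__ == "__main__":
    db = "benchdb"  # hypothetical database name
    print(shlex.join(init_command(db, 10_000_000)))
    # The run time below is an assumption; the blog does not state one.
    print(shlex.join(load_command(db, clients=40, threads=40, seconds=300)))
```

Run as a script, this prints `pgbench -i -s 100 benchdb` followed by `pgbench -c 40 -j 40 -T 300 benchdb`, which can then be executed against the Postgres instance with the appropriate connection options.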
Our goal wasn’t to capture performance numbers though. We just wanted to get a “warm” database created for some data protection work. That being said, what we saw on our modest cluster was impressive, with sub-millisecond latency and plenty of IO.
Data protection
With our containerized database warmed up, it was time to protect it. As you probably know, there are many ways to do this, some better than others. We’ll spend just a moment talking about two functional methods of data protection – crash consistent and application consistent backups. PowerProtect Data Manager supports both crash-consistent and application consistent database backups.
A “crash consistent” backup is exactly as the name implies. The backup application captures the volume in its running state and copies out the data regardless of what’s currently happening. It’s as if someone had just pulled the power cord on the workload. Needless to say, that’s not the most desirable backup state, but it’s still better than no backup at all.
That’s where an “application consistent” backup can be more desirable. An application consistent backup talks with the application and makes sure the data is all “flushed” and in a “clean” state prior to it being backed up. At least, that’s the simple version.
The longer version is that the backup application talks to the OS and application, asks them to flush their buffers – known as quiescing – and then triggers an application-level snapshot of the volumes to be backed up. Once that is complete, the system initiates a snapshot of those volumes on the underlying storage – in this case PowerFlex. Once the storage snapshots are completed, the application-level snapshots are released, the applications begin writing normally again, and the backup application begins to copy the storage snapshot to the protected location. All of this happens in a matter of seconds, and often faster.
This is why application consistent backups are preferred. The backup can take about the same amount of time to run, but the data is in a known good state, which makes the chances of recovery much greater than with crash consistent backups.
In our lab environment, we did this with PowerProtect Data Manager and PowerProtect DD Virtual Edition (DDVE). PowerProtect Data Manager provides a standardized way to quiesce a supported database, back up the data from that database, and then return the database to operation. This works great for protecting Kubernetes workloads running on PowerFlex. It’s able to create application consistent backups of the Postgres containers quickly and efficiently. This also works in concert with GDC Virtual, allowing for the containers to be registered and restored into the cloud environment.
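The quiesce, snapshot, release, and copy sequence described above can be sketched in a few lines. This is an illustrative model only, not how PowerProtect Data Manager is implemented: the four callables (`flush`, `resume`, `take_storage_snapshot`, `copy_to_protection_storage`) are hypothetical hooks standing in for the application, the storage array, and the backup target.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def quiesced(flush: Callable[[], None], resume: Callable[[], None]) -> Iterator[None]:
    """Hold the application in a flushed state and always resume it afterwards."""
    flush()
    try:
        yield
    finally:
        resume()  # runs even if the snapshot fails, so writes are never left paused


def application_consistent_backup(
    flush: Callable[[], None],
    resume: Callable[[], None],
    take_storage_snapshot: Callable[[], str],
    copy_to_protection_storage: Callable[[str], None],
) -> str:
    # The snapshot is taken while writes are paused, so it captures a clean state.
    with quiesced(flush, resume):
        snapshot_id = take_storage_snapshot()
    # The application is already writing again; the copy reads from the snapshot.
    copy_to_protection_storage(snapshot_id)
    return snapshot_id


if __name__ == "__main__":
    events: List[str] = []
    application_consistent_backup(
        flush=lambda: events.append("quiesce"),
        resume=lambda: events.append("resume"),
        take_storage_snapshot=lambda: events.append("snapshot") or "snap-001",
        copy_to_protection_storage=lambda snap: events.append(f"copy {snap}"),
    )
    print(events)
```

The point of the ordering is that the slow step, copying data to protection storage, happens after the application has resumed. The application is paused only for as long as the snapshot takes.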
Figure 3. An application consistent backup and its timing in the PowerProtect Data Manager UI
It’s great having application consistent backups of your cloud workloads, “checking” many of those boxes that people require from their backup environments. That said, just as important and not to be forgotten is the recovery of the backups.
Data recovery
As has been said many times, “never trust a backup that hasn’t been tested.” It’s important to test any and all backups to make sure they can be recovered. Testing the recovery of a Postgres database running in GDC Virtual on PowerFlex is as straightforward as can be.
The high-level steps are:
1. From the PowerProtect Data Manager UI, select Restore > Assets, and select the Kubernetes tab. Select the checkbox next to the protected namespace and click Restore.
2. On the Select Copy page, select the copy you wish to restore from.
3. On the Restore Type page, select where it should be restored to.
4. Determine how the Persistent Volume Claims (PVCs) and namespace should be restored.
5. When finished, test the restore.
You might have noticed that in step 4, we mentioned PVCs, which are the container’s connections to the data and, as the name implies, allow that data to persist across the nodes. This is made possible by the CSI/CSM mentioned earlier. Because of the integration across the environment, restoring PVCs is a simple task.
The following shows some of the recovery options in PowerProtect Data Manager for PVCs.
Figure 4. PowerProtect Data Manager UI – Namespace restore options
The recovery, like most things in data protection, is relatively anticlimactic. Everything is functional, and queries work as expected against the Postgres database instance.
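One simple way to check that a restored pgbench database looks right is to compare table row counts with what pgbench creates at initialization. The sketch below is an assumption about how such a check could be done, not the procedure used in the lab. It takes the counts as a plain dictionary (for example, gathered with `SELECT count(*)` queries) so that it stays independent of any database driver. The `pgbench_history` table is left out because it grows as the benchmark runs.

```python
from typing import Dict, List

# Rows pgbench creates per scale unit at initialization time.
ROWS_PER_SCALE = {
    "pgbench_branches": 1,
    "pgbench_tellers": 10,
    "pgbench_accounts": 100_000,
}


def expected_counts(scale: int) -> Dict[str, int]:
    """Row counts a freshly initialized pgbench database has at this scale."""
    return {table: rows * scale for table, rows in ROWS_PER_SCALE.items()}


def restore_mismatches(actual: Dict[str, int], scale: int) -> List[str]:
    """Return human-readable problems; an empty list means the counts match."""
    problems = []
    for table, want in expected_counts(scale).items():
        got = actual.get(table)  # None if the table is missing after the restore
        if got != want:
            problems.append(f"{table}: expected {want}, found {got}")
    return problems


if __name__ == "__main__":
    # Hypothetical counts read from a restored database at scale factor 100.
    restored = {
        "pgbench_branches": 100,
        "pgbench_tellers": 1_000,
        "pgbench_accounts": 10_000_000,
    }
    print(restore_mismatches(restored, scale=100) or "row counts match")
```

Because the default pgbench workload updates existing rows and inserts only into `pgbench_history`, these three counts should be unchanged by the load test. A row count check is a sanity test, not proof of data integrity.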
Dell and Google Cloud collaborated extensively to create solutions that leverage both PowerFlex and GDC Virtual. The power of this collaboration really shows through when recovery operations just work. That consistency and ease enables customers to take advantage of a robust environment backed by leaders in the space and helps to remove one nightmare that keeps developers and IT admins awake at night, allowing them to rest easy and be prepared to change the world.
If any of this sounds interesting to you and you’ll be at KubeCon in Chicago, Illinois on November 6-9, stop by the Google Cloud booth. We’ll be happy to show you demos of this exciting collaboration in action. Otherwise, feel free to contact your Dell representative for more details.
Authors: Tony Foster, Vinod Kumar Kumaresan, Harsha Yadappanavar
X (formerly Twitter): @harshauy
Related Blog Posts
KubeCon NA23, Google Cloud Anthos on Dell PowerFlex and More
Sun, 05 Nov 2023 23:26:43 -0000
KubeCon will be here before you know it. There are so many exciting things to see and do. While you are making your plans, be sure to add a few things that will make things easier for you at the conference and afterwards.
Before we get into those things, did you know that the Google Cloud team and the Dell PowerFlex team have been collaborating? Recently Dell and Google Cloud published a reference architecture: Google Cloud Anthos and GDC Virtual on Dell PowerFlex. This illustrates how both teams are working together to enable consistency between cloud and on premises environments like PowerFlex. You will see this collaboration at KubeCon this year.
On Tuesday at KubeCon, after breakfast and the keynote, you should make your way to the Solutions Showcase in Hall F on Level 3 of the West building. Once there, make your way over to the Google Cloud booth and visit with the team! They want your questions about PowerFlex and are eager to share with you how Google Distributed Cloud (GDC) Virtual with PowerFlex provides a powerful on-premises container solution.
Also, be sure to catch the lightning sessions in the Google Cloud booth. You’ll get to hear from Dell PowerFlex engineer Praphul Krottapalli, who will dig into leveraging GDC Virtual on PowerFlex. That’s not the big thing though. He’ll also look at running a Postgres database distributed across on-premises PowerFlex nodes using GDC Virtual, and at how to protect these containerized database workloads by using Dell PowerProtect Data Manager to create application consistent backups of a containerized Postgres database instance.
We all know backups are only good if you can restore them. So, Praphul will show you how to recover the Postgres database and have it running again in no time.
Application consistency is an important thing to keep in mind with backups. Would you rather have a database backup where someone had just pulled the plug on the database (crash consistent) or would you like the backup to be as though someone had gracefully shut down the system (application consistent)? For all kinds of reasons (time, cost, sanity), the latter is highly preferable!
We talk about this more in a blog that covers the demo environment we used for KubeCon.
This highlights Dell and Google’s joint commitment to modern apps by ensuring that they can be run everywhere and that organizations can easily develop and deploy modern workloads.
If you are at KubeCon and would like to learn more about how containers work on Dell solutions, be sure to stop by both the Dell and Google Cloud booths. If it’s after KubeCon, be sure to reach out to your Dell representative for more details.
Author: Tony Foster
Data Protection for Virtualized Environments: Made Simpler with PowerProtect Data Manager
Wed, 24 Apr 2024 11:27:28 -0000
VMware is arguably the leader in enterprise virtualization today. Used by countless enterprises to host their business-critical workloads, it has great features and capabilities across the board when it comes to manageability, scalability, and high-availability.
While VMware vSphere offers great high-availability capabilities for your customers and business stakeholders, you still must protect your data from disaster. At the heart of any good disaster recovery plan is effective backups. When it comes to protecting virtual machines, there are best practices you want to follow when backing up your data.
For most businesses, any software or hardware failure that interrupts virtual machine (VM) services typically results in some degree of financial loss. Actions such as granular restoration and full disaster recovery can also be especially useful in specific use cases. And you can perform all of these data recovery techniques (and more) with a proper VMware virtual machine backup solution.
PowerProtect Data Manager for a simpler way to back up virtualized environments
Dell PowerProtect Data Manager provides software-defined data protection, automated discovery, deduplication, operational agility, self-service, and IT governance for physical, virtual, and cloud environments.
- Orchestrate protection directly through an intuitive interface or empower data owners to perform self-service backup and restore operations from their native applications
- Ensure compliance and meet even the strictest of service level objectives
- Leverage existing Dell PowerProtect appliances
PowerProtect Data Manager provides a consistent protection experience for VMware environments. Data Manager is the only solution to provide native vSphere integration with vCenter for VM protection, allowing storage and backup admins, as well as VM owners, to choose a storage policy to apply to each VM automatically when it is instantiated.
PowerProtect Data Manager integration with VMs helps to manage, protect, and reuse VM data across the enterprise by deploying services to accomplish the following tasks:
- Discover, access, and recover VM copies non-disruptively across primary and protection storage without introducing new infrastructure or complexity
- Automate efficient copy creation
- Efficiently automate data retention SLA compliance, ensuring that the right number of copies are stored in the right place at the right level of protection
- Optimize operations based on actionable analytics and insight
Virtual Machine (VM) backup options
- VM Consistent Backup:
Capture all the virtual machine disks at the same time and back up the data to storage targets to create a transactionally consistent backup. This option can be used for Windows and Linux VMs, and for guest operating systems that have applications other than SQL Server.
- Application-Aware Backup:
Application aware full backup is an extension of VM full backup. For VMs with a SQL application installed, select this type to quiesce the application to perform the SQL database and transaction log backup. When you select this type, you must provide Windows account credentials for the VM.
VM Direct Engine is the protection appliance within PowerProtect Data Manager that allows you to manage and protect VM assets. The PowerProtect Data Manager software comes pre-bundled with an embedded VM Direct Engine and allows you to deploy additional external VM Direct Engines.
VM Direct protection engine
VM Direct Engine is deployed in the vSphere environment to perform virtual machine snapshot backups. VM Direct Engine improves performance and reduces network bandwidth utilization by using PowerProtect DD series appliance source-side deduplication.
There are two types of VM Direct Engines: internal and external.
Internal VM Direct Protection Engine:
- Protection using Transparent Snapshot Data Mover (TSDM)
- For small scale environments
- Does not require concurrent backups
- Uses Network Block Device transport mode
- Data is transferred over the network to the ESXi server that is hosting the proxy
- Proxy gets data from its ESXi host and writes to storage
External VM Direct Protection Engine:
- For larger scale environments
- Requires concurrent data protection operations
- Can also use NBD, but the preferred method is HotAdd transport mode for better performance
- Proxy attaches itself to the VM disk snapshots to be backed up
- Proxy reads data from the attached disk and writes to storage
Note: When you deploy and register an additional VM Direct Engine, PowerProtect Data Manager uses this appliance instead of the embedded VM Direct Engine. If the added VM Direct Engine is not available, the embedded VM Direct Engine is used to ensure that backups complete successfully.
Depending on the vSphere version and the VM protection policy options selected, VM Direct can protect VMs by using traditional VMware APIs for Data Protection (VADP) snapshots, or by using the Transparent Snapshot Data Mover (TSDM) protection mechanism introduced in PowerProtect Data Manager version 19.9.
An external VM Direct Engine is not required for VM protection policies that use the Transparent Snapshot Data Mover (TSDM) protection mechanism. For these policies, the embedded VM Direct Engine is sufficient.
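The selection rules in the two notes above can be summarized in a small decision function. This is a simplified model for illustration only; the `Engine` class and `pick_engine` function are hypothetical names, and the real scheduling logic inside PowerProtect Data Manager is not described in this article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Engine:
    name: str
    external: bool
    available: bool = True


def pick_engine(engines: List[Engine], uses_tsdm: bool) -> Engine:
    """Choose an engine following the rules stated in the notes above."""
    # There is always one embedded engine bundled with the product.
    embedded = next(e for e in engines if not e.external)
    if uses_tsdm:
        # TSDM policies do not need an external engine.
        return embedded
    for engine in engines:
        if engine.external and engine.available:
            # A registered external engine is used instead of the embedded one.
            return engine
    # No external engine is reachable, so fall back to the embedded engine.
    return embedded


if __name__ == "__main__":
    fleet = [
        Engine("embedded", external=False),
        Engine("external-01", external=True, available=False),
        Engine("external-02", external=True),
    ]
    print(pick_engine(fleet, uses_tsdm=False).name)
    print(pick_engine(fleet, uses_tsdm=True).name)
```

With the sample fleet, a VADP-style policy lands on `external-02` because `external-01` is marked unavailable, while a TSDM policy stays on the embedded engine.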
vStorage API for Data Protection (VADP) snapshots
PowerProtect Data Manager supports Changed Block Tracking (CBT), which allows backup applications to determine the delta of changes in the VM since the last backup, and to read and transfer only those changes during the next incremental backup. This allows PowerProtect Data Manager to use one or more VM proxies to read a VM's disk changes and to transfer them.
Any L0 backup of a VM reads the entire contents of all disks and writes the same to storage using DD Boost (leveraging global deduplication). Any non-L0 backup of a CBT enabled VM will only read changes in the disks from the last backup and overlay those changes on a copy of the last backup to generate a new full backup (while moving only incremental changes).
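The idea of moving only changed blocks while still ending up with a new full backup can be shown with a toy model. This is a conceptual sketch, not the DD Boost or CBT implementation: disks are modeled as lists of equally sized blocks, and `changed_blocks` simply diffs two lists as a stand-in for the change information that the hypervisor reports in a real environment.

```python
from typing import Dict, List

Block = bytes


def level0_backup(disk: List[Block]) -> List[Block]:
    """An L0 backup reads and stores every block of the disk."""
    return list(disk)


def changed_blocks(previous: List[Block], current: List[Block]) -> Dict[int, Block]:
    """Stand-in for CBT: report only the blocks that differ since the last backup.

    Assumes both lists describe a disk of the same size.
    """
    return {i: new for i, (old, new) in enumerate(zip(previous, current)) if old != new}


def synthesize_full(last_full: List[Block], delta: Dict[int, Block]) -> List[Block]:
    """Overlay the changed blocks on a copy of the last full to get a new full."""
    new_full = list(last_full)
    for index, block in delta.items():
        new_full[index] = block
    return new_full


if __name__ == "__main__":
    disk_day1 = [b"A", b"B", b"C", b"D"]
    full_day1 = level0_backup(disk_day1)

    disk_day2 = [b"A", b"X", b"C", b"Y"]  # two blocks changed
    delta = changed_blocks(disk_day1, disk_day2)
    full_day2 = synthesize_full(full_day1, delta)

    print(f"blocks transferred: {len(delta)} of {len(disk_day2)}")
    print(f"new full matches disk: {full_day2 == disk_day2}")
```

Only two of the four blocks are transferred on day two, yet the resulting copy is a complete image of the disk, which is what makes every restore point a full backup.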
Backup files are written to storage using fixed size segments (FSS) of 8 KB. Backup files on storage are always thick, that is, the VMDK file size on storage is equal to the size of the provisioned disk.
Transparent Snapshot Data Mover (TSDM)
TSDM is a protection mechanism in PowerProtect Data Manager and was designed to replace the VMware vStorage API for Data Protection (VADP) protection mechanism for crash-consistent VM protection.
The advantages of using the TSDM protection mechanism for VM data protection include the following:
- Eliminates the latency and performance impact on the production virtual machine during the protection policy life cycle
- Reduces the CPU, storage, and memory consumption required for backups. After the initial full backup, only incremental backups using the immediate previous snapshot are performed.
- An external VM Direct Engine is not required. The VM Direct Engine embedded with PowerProtect Data Manager is sufficient.
- Automatic scaling
Transparent snapshots architecture
For more details, see the technical white paper VMware Virtual Machine Protection using Transparent Snapshots.
Note: PowerProtect Data Manager manages the TSDM component by using the VIB (VMware Certified) from Dell Technologies. This component is installed dynamically as part of the integration of PowerProtect Data Manager that requires protection of VMs using Transparent Snapshots. The APIs being used are supported in VMware ESXi 7.0 U3 and later.
Transport modes used in PowerProtect Data Manager
PowerProtect Data Manager supports the HotAdd and Network Block Device (NBD) transport modes. You select a transport mode when adding the vProxy appliance (Hot Add, NBD, or the default setting, “Hot Add, Failback to Network Block Device”).
In NBD mode, the ESX/ESXi host reads data from storage and sends it across a network to the target storage. As its name implies, this transport mode is not LAN‐free, unlike SAN transport.
HotAdd is a VMware feature in which devices can be added “hot” while a VM is running. Besides SCSI disks, CPUs and memory capacity can also be added to a running VM. If backup software runs in a virtual appliance, it can take a snapshot and create a linked clone of the target virtual machine, then attach and read the linked clone’s virtual disks for backup. This involves a SCSI HotAdd on the ESXi host where the target VM and backup proxy are running. Virtual disks of the linked clone are hot added to the backup proxy. The target VM continues to run during backup.
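The default setting, which tries HotAdd first and then falls back to NBD, follows a common "preferred path, then fallback" pattern. The sketch below models only that pattern. The `read_with_fallback` function and the reader callables are hypothetical and do not correspond to any PowerProtect Data Manager or VMware API.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple


def read_with_fallback(
    modes: Sequence[str],
    readers: Dict[str, Callable[[], bytes]],
) -> Tuple[str, bytes]:
    """Try each transport mode in order and return the first one that works."""
    last_error: Optional[Exception] = None
    for mode in modes:
        try:
            return mode, readers[mode]()
        except OSError as exc:
            # This mode is not usable for this VM; remember why and try the next.
            last_error = exc
    raise RuntimeError("no transport mode succeeded") from last_error


if __name__ == "__main__":
    def hotadd_reader() -> bytes:
        # Simulated failure, for example the proxy cannot attach the disk.
        raise OSError("HotAdd not possible for this VM")

    def nbd_reader() -> bytes:
        return b"disk-bytes-over-the-network"

    mode, data = read_with_fallback(
        ["hotadd", "nbd"],
        {"hotadd": hotadd_reader, "nbd": nbd_reader},
    )
    print(mode, len(data))
```

Ordering the list as HotAdd then NBD reflects the preference stated earlier for HotAdd where it is available, with NBD as the path that works over the network when it is not.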
PowerProtect plug-in for the vSphere Client
If you want to make a subset of the PowerProtect Data Manager UI functionality available within the vSphere Client, select vSphere Plugin. When adding a vCenter server as an asset source in the PowerProtect Data Manager UI, if you enable the vSphere Plugin option, a pane for PowerProtect appears in the vSphere Client.
This pane provides a subset of PowerProtect Data Manager functionality, including the option to perform a manual backup, an image-level restore, and a file-level restore of VMs covered by PowerProtect Data Manager VM protection policies.
PowerProtect Search Engine
PowerProtect Search Engine is installed by default when PowerProtect Data Manager is installed. The PowerProtect Search Engine indexes virtual machine file metadata to enable searches based on configurable parameters. When the indexing is added to protection policies, the assets are indexed as they are being backed up.
You can add up to five search nodes starting from PowerProtect Data Manager v19.5. You can index up to 1000 assets (VMs) per node.
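These two limits make for simple sizing arithmetic: five nodes at 1000 assets each gives a ceiling of 5000 indexed VMs. The helper below is a minimal sketch of that calculation; the function name is hypothetical, and treating one node as the minimum is an assumption.

```python
import math

MAX_SEARCH_NODES = 5     # limit stated above
ASSETS_PER_NODE = 1000   # indexed assets (VMs) per node, as stated above


def search_nodes_needed(indexed_assets: int) -> int:
    """Number of search nodes needed to index the given number of VMs."""
    if indexed_assets < 0:
        raise ValueError("indexed_assets must not be negative")
    # Assumption: at least one node is always present.
    nodes = max(1, math.ceil(indexed_assets / ASSETS_PER_NODE))
    if nodes > MAX_SEARCH_NODES:
        limit = MAX_SEARCH_NODES * ASSETS_PER_NODE
        raise ValueError(f"{indexed_assets} assets exceeds the {limit}-asset limit")
    return nodes


if __name__ == "__main__":
    for vm_count in (800, 2500, 5000):
        print(vm_count, "VMs ->", search_nodes_needed(vm_count), "node(s)")
```

For example, indexing 2500 VMs needs three nodes, and 5000 VMs needs all five.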
Supported use cases:
- Searching for files from their respective copies across VMs of different kinds (Windows and Linux)
- Searching for files from their respective backup copies for similar kinds of VMs
- Searching for files and restoring a file from its various backup copies at different points in time (PITs)
Supported enhanced VMware topologies for VM protection
PowerProtect Data Manager provides protection for clustered ESXi server storage, networking, and enterprise management. Understanding which topologies are supported in these environments can help you design your network infrastructure.
Supported enhanced topologies
Supported topologies of clustered ESXi server storage, networking, and enterprise management include the following:
- vSAN operations
- NSX-T port groups
- Enhanced Linked Mode vCenter servers
Virtual Machine Restore Options
After virtual assets are backed up as part of a virtual machine protection policy in the PowerProtect Data Manager UI, users can perform image-level and file-level recoveries from individual or multiple virtual machine backups, and also restore individual virtual machine disks (VMDKs) to their original location. Below are the different types of recoveries that can be performed, according to customer requirements.
- Restore to a new virtual machine — A Create and Restore to New VM enables you to create a new VM using a copy of the original VM backup. Other than having a new name or location and a new vSphere VM Instance UUID, this copy is an exact replica of the VM that you backed up with the protection policy in PowerProtect Data Manager.
Note: Full SQL-database and transaction log restores of a virtual machine from application aware virtual machine protection policies must be performed using Microsoft application agent tools.
- Restore individual virtual disks — A Restore Individual Virtual Disks operation recovers individual virtual disks (VMDKs) to their original location on the vCenter server, rolling the VMDKs that you backed up with the protection policy in PowerProtect Data Manager to an earlier point in time.
- Restore to the original virtual machine — A Restore to Original VM operation recovers a VM backup to its original location on the vCenter server. This operation rolls the VMs that you backed up with the protection policy in PowerProtect Data Manager to an earlier point in time.
Note: Starting with PowerProtect Data Manager version 19.10, you can restore the VM configuration during a Restore to Original VM. If there were changes to the VM disk configuration, you cannot clear this option.
- Self-service restores — A PowerProtect Data Manager system or security administrator can enable users to perform self-service restores of their own assets without further administrator intervention. Self-service restores require a scope of authority that includes the Restore Administrator role for the relevant user assets or asset sources.
- Direct restore to ESXi — If the VM you protected with PowerProtect Data Manager was a vCenter VM, but the VM and vCenter server are now lost or no longer available, direct restore to ESXi enables you to recover the VM directly to an ESXi host without a vCenter server.
Note: A Direct Restore to ESXi requires either the embedded VM Direct Engine with PowerProtect Data Manager, or an external VM Direct appliance that is added and registered to PowerProtect Data Manager. Additionally, ensure that you disconnect the ESXi host from the vCenter server.
- Instant Access VM restore — An Instant Access VM restore enables you to create a new VM directly from the original VM backup on protection storage for instant backup validation and recovery of individual files. The instant access VM is initially available for seven days. This process does not copy or move any data from protection storage to the production datastore. An instant access VM restore also provides the option to move the VM to a production datastore when you want to retain access to the VM for a longer time.
Data and workloads have evolved over time, leading to a multi-generational data sprawl that introduces multiple risks to your business. Collectively, this set of risks forms a business integrity gap: the gap between where organizations’ data environments are today, hampered by the challenges of data sprawl, and where their data environments should be in order to thrive, accelerate, and digitally transform.
PowerProtect Data Manager helps you meet these challenges with its Intelligent Data Services Platform, which delivers a flexible, future-proof architecture that closes the business integrity gap and enables organizations to accelerate business growth.
PowerProtect Data Manager provides a single, robust VMware backup solution. Why settle for multiple point solutions for backup and recovery, disaster recovery and test/development when you can deploy PowerProtect Data Manager to provide these capabilities in a single solution?
For more details on PowerProtect Data Manager Virtualized Environment protection, see the white paper Dell PowerProtect Data Manager: Virtual Machine Protection and visit the Dell PowerProtect Data Manager web site.
Author: Chetan Padhy, Senior Engineering Technologist, Data Protection Division