PowerFlex snapshot as backup and recovery solution
PowerFlex snapshots are consistent block images, presented as addressable target storage volumes, that capture the state of their source volumes at a specific point in time. They are thin-provisioned (they consume no additional storage capacity when created) and writable, unless you select the Secure Snapshot or Read Only option. Once created, a snapshot becomes a new volume that can be managed like any other volume exposed by the storage system. A snapshot created with the Secure Snapshot option supports data retention compliance by preventing alteration or deletion until a set expiration time.
Snapshots can be managed manually through the PowerFlex UI, CLI, or REST API, with automated policies available for creation, retention, and deletion according to predefined schedules. Snapshots and their source volume are organized into a Volume Tree (V-Tree). The V-Tree includes the root volume and all descendant snapshots resulting from that volume. A V-Tree includes not only snapshots taken of the root volume at different points in time, but also descendants that are snapshots of snapshots. For example, it is a common practice to create a snapshot of a production database, mask any proprietary or private data, then create additional snapshots and provision them to groups such as Development and QA.
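The V-Tree relationship described above can be pictured as a simple tree. The following Python sketch only illustrates the concept and is not PowerFlex code: the Volume class and all volume names (prod_db, prod_db_masked, dev_copy, qa_copy) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """A volume or snapshot; a snapshot is itself a volume in the V-Tree."""
    name: str
    children: list["Volume"] = field(default_factory=list)

    def snapshot(self, name: str) -> "Volume":
        """Model taking a snapshot: the new volume becomes a child node."""
        snap = Volume(name)
        self.children.append(snap)
        return snap

    def descendants(self) -> list[str]:
        """Names of all snapshots below this volume, depth-first."""
        names: list[str] = []
        for child in self.children:
            names.append(child.name)
            names.extend(child.descendants())
        return names


# Hypothetical names following the example in the text.
prod = Volume("prod_db")                  # root volume of the V-Tree
masked = prod.snapshot("prod_db_masked")  # snapshot whose private data is masked
dev = masked.snapshot("dev_copy")         # snapshot of a snapshot for Development
qa = masked.snapshot("qa_copy")           # snapshot of a snapshot for QA

vtree = [prod.name] + prod.descendants()
print(vtree)
```

In this model, the Development and QA copies descend from the masked snapshot rather than from the production volume, which mirrors the practice described in the text.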
For more information about PowerFlex snapshots and how to use them, see the Dell PowerFlex: Snapshots Technical White Paper.
Note: Snapshots do not protect against hardware or storage system failures, because they are stored on the same storage system as the original data. To protect against such failures, organizations should implement remote replication, other backup strategies, or both.
PowerFlex snapshots support consistency groups. You can select one or more volumes and snapshot them in a single operation; all snapshots taken of multiple volumes this way form a consistency group. The snapshots in a consistency group are guaranteed to be from precisely the same point in time. They capture a crash-consistent (or storage-consistent) image across multiple volumes, which makes them an efficient way to create database copies.
To ensure that the database can restart from that snapshot image, the snapshot must include all the volumes holding the database datafiles and transaction log files. Once the snapshot is mapped to the target host, the PostgreSQL service pointing to the snapshot data can simply be started. PostgreSQL then performs crash recovery, so the database copy contains all transactions committed up to the time the snapshot was created. When a snapshot is taken in coordination with the PostgreSQL database APIs, it can also serve for database backup and recovery, as demonstrated later in this paper.
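As one illustration of what coordination with the PostgreSQL APIs can look like, the sketch below brackets a storage snapshot with PostgreSQL's low-level backup functions. It is a minimal sketch under stated assumptions, not the procedure used in this paper: pg_backup_start and pg_backup_stop are the PostgreSQL 15 and later function names (earlier releases use pg_start_backup and pg_stop_backup), while snapshot_backup, take_snapshot, RecordingCursor, and the label value are hypothetical names. The cursor is assumed to be a DB-API style cursor; the stand-in cursor only lets the sketch run without a database.

```python
from typing import Callable


def snapshot_backup(cur, take_snapshot: Callable[[], None], label: str):
    """Take a storage snapshot between pg_backup_start and pg_backup_stop.

    `cur` is assumed to be a DB-API cursor on a session that stays open
    for the whole call; both functions must run in the same session.
    """
    # The second argument (fast => true) requests an immediate checkpoint.
    cur.execute("SELECT pg_backup_start(%s, true)", (label,))
    try:
        take_snapshot()  # hypothetical hook: snapshot all database volumes
    finally:
        # Always end the backup, even if the snapshot step failed.
        cur.execute("SELECT lsn, labelfile, spcmapfile FROM pg_backup_stop()")
        result = cur.fetchone()
    return result


class RecordingCursor:
    """Stand-in cursor that records SQL instead of contacting a database."""

    def __init__(self):
        self.calls = []

    def execute(self, sql, params=()):
        self.calls.append(sql)

    def fetchone(self):
        # Placeholder row shaped like (lsn, labelfile, spcmapfile).
        return ("0/0", "placeholder backup_label text", "")


# Dry run: the snapshot hook only records that it was invoked.
cur = RecordingCursor()
snapshot_backup(cur, lambda: cur.calls.append("<storage snapshot>"), "pflex_snap")
print(cur.calls)
```

With a real connection, the labelfile value returned by pg_backup_stop would need to be saved as a file named backup_label in the data directory of the restored copy before PostgreSQL is started for recovery.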
PowerFlex does not enforce the consistency group after it is created. It is the responsibility of the user to continue to treat the volumes as a group for operations such as mounting them to a server, deleting them, creating additional snapshots from these volumes, and so on. To help the user keep track of the group, a user-friendly alias can be provided when creating and restoring the snapshot.
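Because group membership is left to the user, a script can check that a set of snapshots covers every volume the database needs before the copy is mounted or started. The following is a minimal sketch of such a check; the function name, the volume names, and the "nightly01" alias prefix are all hypothetical, and the mapping of snapshot names to source volumes is assumed to come from the user's own records.

```python
def missing_from_group(required: set[str], group: dict[str, str]) -> set[str]:
    """Return required source volumes that have no snapshot in the group.

    `group` maps each snapshot name to the name of its source volume.
    """
    return required - set(group.values())


# Hypothetical volume names: datafiles and transaction logs are required.
required = {"pg_data", "pg_wal"}

# Snapshots sharing the alias prefix "nightly01" (hypothetical convention).
complete = {"nightly01_pg_data": "pg_data", "nightly01_pg_wal": "pg_wal"}
partial = {"nightly01_pg_data": "pg_data"}

print(missing_from_group(required, complete))  # nothing missing
print(missing_from_group(required, partial))   # transaction log volume missing
```

A non-empty result means the group cannot be used to restart the database, since the datafile and transaction log volumes must come from the same point in time.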
PostgreSQL provides several options for backups, such as its built-in tools (pg_dump or pg_basebackup), operating-system-level options, or third-party backup tools that use a database data dump or file copy. However, as database size grows, a snapshot-based backup becomes more efficient than a logical backup such as pg_dump, because copying large amounts of data takes a long time and incurs performance overhead. Storage-based snapshots, when taken following the procedure for creating valid database backups, take seconds to create regardless of the database size. They also offer faster recovery, because their data is immediately available and does not need to be copied back. In addition, storage-based snapshot backups make it possible to recover individual objects by mounting the snapshot image at a new location and inspecting its data or exporting the required components.
Database administrators must assess transaction frequency, database size, change rate, application criticality, space constraints, and risk tolerance to plan for effective restoration and contingency measures.
In this section, the following use cases are described:
Note: The use cases are also applicable to virtualized environments, although some additional steps may be required to manage the datastores in the hypervisor.
PowerFlex storage volumes are mapped to each compute node: one volume for the datafiles, one for the transaction logs, and one for archiving the write-ahead log (WAL) files. The PowerFlex volumes appear as /dev/scini* devices. They are formatted as XFS filesystems and mounted at their respective mount points with appropriate permissions. PostgreSQL is installed, its service is started, and the database is initialized and loaded with data to simulate a customer workload.
postgres=# SELECT pg_size_pretty(sum(pg_database_size(datname))) AS total_size
FROM pg_database;
total_size
------------
135 GB
(1 row)
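The volume layout described in this section can be verified from the host before any snapshot is taken. The sketch below lists the XFS filesystems that sit on /dev/scini* devices by parsing text in the /proc/mounts layout. The function name, the sample text, and the mount points (/pgdata, /pgwal, /pgarchive) are hypothetical and are not taken from the test environment.

```python
def powerflex_xfs_mounts(mounts_text: str) -> dict[str, str]:
    """Map mount point -> device for XFS filesystems on /dev/scini* devices.

    `mounts_text` uses the /proc/mounts layout:
    device, mount point, filesystem type, options, dump, pass.
    """
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0].startswith("/dev/scini") and fields[2] == "xfs":
            found[fields[1]] = fields[0]
    return found


# Hypothetical sample; on a real host, read the text from /proc/mounts.
sample = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/scinia /pgdata xfs rw,noatime 0 0
/dev/scinib /pgwal xfs rw,noatime 0 0
/dev/scinic /pgarchive xfs rw,noatime 0 0
"""

mounts = powerflex_xfs_mounts(sample)
print(mounts)
```

On a real host, the same function could be called with the contents of /proc/mounts, and the result compared against the three expected mount points.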