Short articles related to PowerProtect Data Manager.
Mon, 20 Nov 2023 15:42:26 -0000
In this installment of the PowerProtect Data Manager (PPDM) automation series of blogs, we will focus on a new solution that automates PowerProtect Data Manager life cycle management.
For PPDM automation, we have auto-policy creation and ad-hoc VM backup solutions, use-case driven tasks, complete PPDM deployment automation, and so on - all available in the official PowerProtect Data Manager GitHub repository. And now, I am proud to present to you the PPDM life cycle automation solution.
So, let’s take a closer look at the solution.
What does the solution do?
The PowerProtect Data Manager automated life cycle management automates PPDM upgrades with a wide variety of options:
It is a Python-based script that operates with the PPDM REST API.
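To give a sense of the pattern, here is a minimal, hedged Python sketch of the first step such a script performs: authenticating against the PPDM REST API to obtain a bearer token. This is not the actual ppdm_upgrade.py code, and the endpoint, port, and response field are assumptions based on the public PPDM REST API documentation on developer.dell.com, so verify them against your PPDM release.
import requests
import urllib3

urllib3.disable_warnings()  # PPDM typically runs with a self-signed certificate

def get_token(server, username, password):
    """Log in to the PPDM REST API (assumed endpoint /api/v2/login) and return a bearer token."""
    uri = f"https://{server}:8443/api/v2/login"
    response = requests.post(uri, json={"username": username, "password": password}, verify=False, timeout=30)
    response.raise_for_status()
    return response.json()["access_token"]

token = get_token("10.0.0.1", "admin", "myTempPwd!")
headers = {"Authorization": f"Bearer {token}"}
# 'headers' would then be reused for the upgrade-related calls (package upload,
# pre-check, and monitoring) that the script automates.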
Here is the list of prerequisites:
The script accepts the following parameters:
Here is the full script syntax:
# python ppdm_upgrade.py -h
usage: ppdm_upgrade.py [-h] -s SERVER [-u USERNAME] -p PASSWORD [-f UPGFILE] [-onlyprecheck] [-skipupload] [-release PPDMRELEASE] [-skipsnapshot] [-onlymonitor]
Script to automate PowerProtect Data Manager lifecycle management
options:
-h, --help show this help message and exit
-s SERVER, --server SERVER
PPDM server FQDN or IP
-u USERNAME, --username USERNAME
Optionally provide the PPDM username
-p PASSWORD, --password PASSWORD
PPDM password
-f UPGFILE, --file UPGFILE
Full path to upgrade package
-onlyprecheck, --only-pre-check
Optionally stops after pre-check
-skipupload, --skip-file-upload
Optionally skips file upload
-release PPDMRELEASE, --ppdm-release PPDMRELEASE
Provide PPDM version if skipping package upload
-skipsnapshot, --skip-snapshot
Optionally skips PPDM VM snapshot
-onlymonitor, --only-monitor
Optionally only monitor running upgrade
Let’s look at some common use cases for automated PPDM life cycle management:
1. PPDM automated upgrade, including file upload:
# python ppdm_upgrade.py -s 10.0.0.1 -p "myTempPwd!" -f /home/idan/dellemc-ppdm-upgrade-sw-19.14.0-27.pkg
2. For cases where there is a need to prepare for an upgrade by uploading the package and running the precheck first, the automated upgrade can be performed in two phases.
a. First, only file upload and precheck:
# python ppdm_upgrade.py -s 10.0.0.1 -p "myTempPwd!" -f /home/idan/dellemc-ppdm-upgrade-sw-19.14.0-27.pkg -onlyprecheck
b. Second, perform the upgrade itself:
# python ppdm_upgrade.py -s 10.0.0.1 -p "myTempPwd!" -skipupload -release 19.14.0-27
3. Monitoring a running upgrade from a different workstation:
# python ppdm_upgrade.py -s 10.0.0.1 -p "myTempPwd!" -onlymonitor
# python ppdm_upgrade.py -s 10.0.0.1 -p "myTempPwd!" -f /home/idan/dellemc-ppdm-upgrade-sw-19.14.0-27.pkg
-> Obtaining PPDM configuration information
---> PPDM is upgrade ready
-> Performing pre-upgrade version checks
---> Current PPDM version: 19.13.0-20
---> Checking upgrade to PPDM version: 19.14.0-27
-> Uploading PPDM upgrade package
---> Upload completed successfully in 3 mins and 34 secs
-> Monitoring upgrade ID 636cc6a3-2e84-4fb8-bb40-87aefd0f7b96
---> Monitoring state AVAILABLE
-> Performing pre-upgrade checks
-> Monitoring upgrade ID 636cc6a3-2e84-4fb8-bb40-87aefd0f7b96
---> Monitoring state PROCESSING
---> Monitoring state PROCESSING
---> Monitoring state PROCESSING
---> Monitoring state AVAILABLE
-> Upgrading PPDM to release 19.14.0-27
---> Monitoring PPDM upgrade
---> Upgrade status: PENDING
---> Upgrade status: RUNNING 3%
----> Upgrade info: current component: eCDM, description: Taking Update Snapshot 3%
----> Upgrade info: seconds elapsed / remaining: 11 / 2477
---> Upgrade status: RUNNING 3%
----> Upgrade info: current component: eCDM, description: Taking Update Snapshot 3%
----> Upgrade info: seconds elapsed / remaining: 41 / 2447
---> Upgrade status: RUNNING 3%
----> Upgrade info: current component: eCDM, description: Taking Update Snapshot 3%
----> Upgrade info: seconds elapsed / remaining: 53 / 2435
---> Upgrade status: RUNNING 3%
----> Upgrade info: current component: eCDM, description: Taking Update Snapshot 3%
----> Upgrade info: seconds elapsed / remaining: 64 / 2424
---> Upgrade status: RUNNING 10%
----> Upgrade info: current component: eCDM, description: Shutting Down Components 10%
----> Upgrade info: seconds elapsed / remaining: 211 / 2004
---> Upgrade status: RUNNING 10%
----> Upgrade info: current component: eCDM, description: Shutting Down Components 10%
----> Upgrade info: seconds elapsed / remaining: 222 / 1993
----> Upgrade info: seconds elapsed / remaining: 266 / 1949
---> Upgrade status: RUNNING 21%
----> Upgrade info: current component: eCDM, description: Updating The RPMs 21%
----> Upgrade info: seconds elapsed / remaining: 277 / 1720
---> Upgrade status: RUNNING 21%
----> Upgrade info: current component: eCDM, description: Updating The RPMs 21%
----> Upgrade info: seconds elapsed / remaining: 288 / 1709
---> Upgrade status: RUNNING 21%
----> Upgrade info: current component: eCDM, description: Updating The RPMs 21%
----> Upgrade info: seconds elapsed / remaining: 299 / 1698
---> Upgrade status: RUNNING 21%
----> Upgrade info: current component: eCDM, description: Updating The RPMs 21%
----> Upgrade info: seconds elapsed / remaining: 563 / 1486
---> Polling timed out, retrying...
---> Polling timed out, retrying...
---> Polling timed out, retrying...
---> Upgrade status: RUNNING 32%
----> Upgrade info: current component: eCDM, description: Starting Components 32%
----> Upgrade info: seconds elapsed / remaining: 688 / 1167
---> Upgrade status: RUNNING 35%
----> Upgrade info: current component: eCDM, description: Migrating Data 50%
----> Upgrade info: seconds elapsed / remaining: 776 / 1038
---> Upgrade status: RUNNING 52%
----> Upgrade info: current component: eCDM, description: Data Migration Completed 52%
----> Upgrade info: seconds elapsed / remaining: 787 / 1038
---> Upgrade status: RUNNING 55%
----> Upgrade info: current component: eCDM, description: Data Migration Completed 55%
----> Upgrade info: seconds elapsed / remaining: 940 / 1038
---> Upgrade status: RUNNING 55%
----> Upgrade info: current component: eCDM, description: Data Migration Completed 55%
----> Upgrade info: seconds elapsed / remaining: 951 / 1038
---> Upgrade status: RUNNING 57%
----> Upgrade info: current component: eCDM, description: Data Migration Completed 57%
----> Upgrade info: seconds elapsed / remaining: 962 / 1038
---> Upgrade status: RUNNING 66%
----> Upgrade info: current component: eCDM, description: Data Manager Core Services Update Completed. Waiting for Other Components to Update... 71%
----> Upgrade info: seconds elapsed / remaining: 1027 / 1038
---> Upgrade status: RUNNING 71%
----> Upgrade info: current component: eCDM, description: Data Manager Core Services Update Completed. Waiting for Other Components to Update... 71%
----> Upgrade info: seconds elapsed / remaining: 1038 / 1038
---> Upgrade status: RUNNING 71%
----> Upgrade info: current component: eCDM, description: Data Manager Core Services Update Completed. Waiting for Other Components to Update... 71%
----> Upgrade info: seconds elapsed / remaining: 1049 / 1038
---> Upgrade status: RUNNING 71%
----> Upgrade info: current component: eCDM, description: Data Manager Core Services Update Completed. Waiting for Other Components to Update... 71%
----> Upgrade info: seconds elapsed / remaining: 1060 / 1038
---> Upgrade status: RUNNING 71%
----> Upgrade info: current component: eCDM, description: Data Manager Core Services Update Completed. Waiting for Other Components to Update... 71%
----> Upgrade info: seconds elapsed / remaining: 1071 / 1038
---> Upgrade status: RUNNING 71%
----> Upgrade info: current component: eCDM, description: Data Manager Core Services Update Completed. Waiting for Other Components to Update... 71%
----> Upgrade info: seconds elapsed / remaining: 1083 / 1038
---> Upgrade status: RUNNING 71%
----> Upgrade info: current component: eCDM, description: Data Manager Core Services Update Completed. Waiting for Other Components to Update... 71%
----> Upgrade info: seconds elapsed / remaining: 1095 / 1038
---> Upgrade status: RUNNING 71%
----> Upgrade info: current component: eCDM, description: Data Manager Core Services Update Completed. Waiting for Other Components to Update... 71%
----> Upgrade info: seconds elapsed / remaining: 1108 / 1038
---> Upgrade status: RUNNING 71%
----> Upgrade info: current component: eCDM, description: Data Manager Core Services Update Completed. Waiting for Other Components to Update... 71%
----> Upgrade info: seconds elapsed / remaining: 1120 / 1038
---> Upgrade status: RUNNING 71%
----> Upgrade info: seconds elapsed / remaining: 1350 / 1038
---> Upgrade status: RUNNING 76%
----> Upgrade info: current component: TSDM, description: Transparent Snapshot Data Mover Update Started 76%
----> Upgrade info: seconds elapsed / remaining: 1361 / 1034
---> Upgrade status: RUNNING 94%
----> Upgrade info: current component: TSDM, description: Transparent Snapshot Data Mover Update Completed 94%
----> Upgrade info: seconds elapsed / remaining: 1372 / 138
---> Upgrade status: COMPLETED 100%
----> Upgrade completed in 22 mins and 56 seconds
-> PPDM upgraded successfully
-> Making sure PPDM is up and running
---> PPDM is available
---> PPDM is operational on version 19.14.0-27
-> All tasks completed successfully
You can find the script in the official PowerProtect GitHub repository:
https://github.com/dell/powerprotect-data-manager
Other than the official PPDM repo on GitHub, developer.dell.com provides comprehensive online API documentation, including a full API reference, tutorials, and use cases for the PPDM REST API.
For additional support, you are more than welcome to raise an issue in GitHub or reach out to me by email:
Idan.kentor@dell.com
Thanks for reading!
Idan
Mon, 18 Sep 2023 22:34:52 -0000
In the spirit of automating EVERYTHING, this blog will showcase the complete deployment of PowerProtect Data Manager (PPDM).
In the PPDM universe, we have auto-policy creation and ad-hoc VM backup solutions, use-case driven tasks, and so on -- all available in the official PowerProtect Data Manager GitHub repository. And now, I am proud to present to you the complete PPDM deployment automation solution.
Without further ado, let’s get started.
The PowerProtect Data Manager automated deployment solution boasts a wide array of functionality, including:
It’s a Python-based script that operates in conjunction with the PPDM REST API and vCenter.
Here is the list of prerequisites:
The script accepts one mandatory parameter, -configfile or --config-file, and six optional parameters:
Here is the full script syntax:
# ppdm_deploy.py -h
usage: ppdm_deploy.py [-h] -configfile CONFIGFILE [-skipova] [-justova] [-vc] [-dd] [-ppdm] [-cross]
Script to automate PowerProtect Data Manager deployment
options:
-h, --help show this help message and exit
-configfile CONFIGFILE, --config-file CONFIGFILE
Full path to the JSON config file
-skipova, --skip-ova Optionally skips OVA deployment
-justova, --just-ova Optionally stops after OVA deployment
-vc, --register-vcenter
Optionally registers vCenter in PPDM
-dd, --add-dd Optionally adds PowerProtect DD to PPDM
-ppdm, --connect-ppdm
Optionally connects remote PPDM system
-cross, --bi-directional
Optionally configures bi-directional communication between the two PPDM hosts
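The deployment parameters themselves come from the JSON config file passed with -configfile; the authoritative sample config and schema ship with the script in the GitHub repository. Purely as an illustration of the pattern (the key names below are hypothetical, not the script's actual schema), such a script typically loads and sanity-checks the config before driving the REST API:
import json
import sys

# Hypothetical keys for illustration only - consult the sample config file in
# the GitHub repository for the real schema expected by ppdm_deploy.py.
REQUIRED_KEYS = ("ppdm_fqdn", "ppdm_password", "vcenter_fqdn", "dd_fqdn")

def load_config(path):
    """Load the JSON config file and fail fast if required keys are missing."""
    with open(path, "r") as config_file:
        config = json.load(config_file)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        sys.exit(f"Config file {path} is missing keys: {', '.join(missing)}")
    return config

config = load_config("ppdm_prod.json")
print(f"Deploying PPDM {config['ppdm_fqdn']} against vCenter {config['vcenter_fqdn']}")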
Let’s look at some common use cases for PPDM deployment:
1. Greenfield deployment of PPDM including registration of PowerProtect DD and vCenter:
# python ppdm_deploy.py -configfile ppdm_prod.json -vc -dd
2. PPDM deployment including registration of vCenter and DD as well as a remote PPDM system:
# python ppdm_deploy.py -configfile ppdm_prod.json -vc -dd -ppdm
3. Full deployment of two PPDM systems including configuration of the remote PPDM systems for bi-directional communication.
In this case, we would run the script twice in the following manner:
# python ppdm_deploy.py -configfile ppdm_siteA.json -vc -dd -ppdm -cross
# python ppdm_deploy.py -configfile ppdm_siteB.json -vc -dd
4. For evaluation or test purposes, the script can stop right after the PPDM OVA deployment:
# python ppdm_deploy.py -configfile ppdm_test.json -justova
5. For PPDM implementations where deployment needs to build on an existing PPDM VM or a previous OVA deployment:
# python ppdm_deploy.py -configfile ppdm_prod.json -skipova
# python ppdm_deploy.py -configfile ppdm_prod.json -vc -dd -ppdm -cross
-> Provisioning PPDM from OVA
Opening OVA source: C:\Users\idan\Downloads\dellemc-ppdm-sw-19.14.0-20.ova
Opening VI target: vi://idan%40vsphere.local@vcenter.hop.lab.dell.com:443/ProdDC/host/DC_HA1/
Deploying to VI: vi://idan%40vsphere.local@vcenter.hop.lab.dell.com:443/ProdDC/host/DC_HA1/
Transfer Completed
Powering on VM: PPDM_Prod_36
Task Completed
Completed successfully
---> OVA deployment completed successfully
-> Checking connectivity to PPDM
---> PPDM IP 10.0.0.36 is reachable
-> Checking PPDM API readiness
---> PPDM API is unreachable. Retrying
---> PPDM API is unreachable. Retrying
---> PPDM API is unreachable. Retrying
---> PPDM API is unreachable. Retrying
---> PPDM API is unreachable. Retrying
---> PPDM API is available
-> Obtaining PPDM configuration information
---> PPDM is deployment ready
-> Accepting PPDM EULA
---> PPDM EULA accepted
-> Applying license
-> Using Capacity license
-> Applying SMTP settings
-> Configuring encryption
-> Building PPDM deployment configuration
-> Time zone detected: Asia/Jerusalem
-> Name resolution completed successfully
-> Deploying PPDM
---> Deploying configuration 848a68bb-bd8e-4f91-8a63-f23cd079c905
---> Deployment status PROGRESS 2%
---> Deployment status PROGRESS 16%
---> Deployment status PROGRESS 20%
---> Deployment status PROGRESS 28%
---> Deployment status PROGRESS 28%
---> Deployment status PROGRESS 28%
---> Deployment status PROGRESS 32%
---> Deployment status PROGRESS 32%
---> Deployment status PROGRESS 32%
---> Deployment status PROGRESS 32%
---> Deployment status PROGRESS 32%
---> Deployment status PROGRESS 32%
---> Deployment status PROGRESS 32%
---> Deployment status PROGRESS 32%
---> Deployment status PROGRESS 32%
---> Deployment status PROGRESS 32%
---> Deployment status PROGRESS 32%
---> Deployment status PROGRESS 32%
---> Deployment status PROGRESS 36%
---> Deployment status PROGRESS 40%
---> Deployment status PROGRESS 40%
---> Deployment status PROGRESS 48%
---> Deployment status PROGRESS 48%
---> Deployment status PROGRESS 48%
---> Deployment status PROGRESS 48%
---> Deployment status PROGRESS 52%
---> Deployment status PROGRESS 52%
---> Deployment status PROGRESS 52%
---> Deployment status PROGRESS 52%
---> Deployment status PROGRESS 52%
---> Deployment status PROGRESS 56%
---> Deployment status PROGRESS 56%
---> Deployment status PROGRESS 56%
---> Deployment status PROGRESS 60%
---> Deployment status PROGRESS 60%
---> Deployment status PROGRESS 72%
---> Deployment status PROGRESS 76%
---> Deployment status PROGRESS 76%
---> Deployment status PROGRESS 80%
---> Deployment status PROGRESS 88%
---> Deployment status PROGRESS 88%
---> Deployment status PROGRESS 88%
---> Deployment status PROGRESS 88%
---> Deployment status PROGRESS 88%
---> Deployment status PROGRESS 88%
---> Deployment status SUCCESS 100%
-> PPDM deployed successfully
-> Initiating post-install tasks
-> Accepting TELEMETRY EULA
---> TELEMETRY EULA accepted
-> AutoSupport configured successfully
-> vCenter registered successfully
--> Hosting vCenter configured successfully
-> PowerProtect DD registered successfully
-> Connecting peer PPDM host
---> Monitoring activity ID 01941a19-ce75-4227-9057-03f60eb78b38
---> Activity status RUNNING 0%
---> Activity status COMPLETED 100%
---> Peer PPDM registered successfully
-> Configuring bi-directional replication direction
---> Monitoring activity ID 8464f126-4f28-4799-9e25-37fe752d54cf
---> Activity status RUNNING 0%
---> Activity status COMPLETED 100%
---> Peer PPDM registered successfully
-> All tasks have been completed
You can find the script and the config file in the official PowerProtect GitHub repository:
https://github.com/dell/powerprotect-data-manager
Other than the official PPDM repo on GitHub, developer.dell.com provides comprehensive online API documentation, including the PPDM REST API.
For additional support, you are more than welcome to raise an issue in GitHub or reach out to me by email:
Thanks for reading!
Idan
Author: Idan Kentor
Mon, 07 Aug 2023 23:05:40 -0000
In this blog, we’ll go through some of the advanced options we’re enabling as part of the Kubernetes (K8s) asset source configuration in PowerProtect Data Manager (PPDM). These advanced parameters can be specified when adding a new K8s asset source or when modifying an existing one. Let’s look at some use cases and how PPDM can help.
For a great first use case, let’s look at an advanced controller configuration. The advanced configuration of the powerprotect-controller pod allows you to configure key-value pairs. There are nine pairs documented in the PowerProtect Data Manager Kubernetes User Guide, but we will focus on the most important ones in this blog.
The first key-value pair allows you to define an internal registry from which to pull container images. By default, the required images are pulled from Docker Hub. For example:
Key: k8s.docker.registry
Value: idan-registry.example.com:8446
The value represents the FQDN of the registry, including the port as needed. Note that if the registry requires authentication, the k8s.image.pullsecrets key-value pair can be specified.
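For example, a pull-secret reference might look like the following. The secret name is only a placeholder, and the secret itself must already exist in the cluster:
Key: k8s.image.pullsecrets
Value: <your-registry-pull-secret>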
By the way, I’ve discussed the Root Certificate option in previous blogs. Take a look at PowerProtect Data Manager – How to Protect AWS EKS Workloads? and PowerProtect Data Manager – How to Protect GKE Workloads?.
The second use case we’ll look at enables the exclusion of Kubernetes resource types from metadata backup. It accepts a comma-separated list of resources to exclude. For example:
Key: k8s.velero.exclude.resources
Value: certificaterequests.cert-manager.io
Another useful advanced option is the ability to customize any or all PowerProtect-related pods - powerprotect-controller, Velero, cProxy, and their configurations. The third use case we’ll cover is Affinity Rules.
The first example is nodeAffinity, which allows you to assign any PowerProtect pod to a node with a specific node label.
This case may be suitable when you need to run the PowerProtect pods on specific nodes. For example, perhaps only some of the nodes have 10Gb connectivity to the backup VLAN, or only some of the nodes have connectivity to PowerProtect DD.
In the following example, any node with the app=powerprotect label can run the configured pod. This example uses the requiredDuringSchedulingIgnoredDuringExecution node affinity option, which means that the scheduler won’t run this pod on any node unless the rule is met.
Note: This must be in YAML format.
The configured pod is patched with the following configuration:
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: app
                operator: In
                values:
                - powerprotect
Here’s another example, but this time with the preferredDuringSchedulingIgnoredDuringExecution node affinity option enabled. This means the scheduler tries to find a node that meets the rule, but if a matching node is not available the scheduler still schedules the pod.
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
              - key: app
                operator: In
                values:
                - powerprotect
Here we can see how it is configured through the PowerProtect Data Manager UI, when registering a new K8s asset source, or when editing an existing one. In this screenshot, I’m updating the configuration for all the PowerProtect pods (powerprotect-controller, Velero, and cProxy), but it’s certainly possible to make additional config changes on any of these PowerProtect pods.
Another much simpler example for node selection is nodeSelector. The pods would only be scheduled to nodes with the specified labels.
spec:
  template:
    spec:
      nodeSelector:
        app: powerprotect
In this example, we’ll examine an alternative way of assigning one of the PowerProtect pods to a specific worker node.
spec:
  template:
    spec:
      nodeName: workernode01
The final example we’ll look at for nodeAffinity is anti-affinity, with operators such as NotIn or DoesNotExist. Anti-affinity is useful when you want to schedule the PowerProtect pods only to nodes that do not carry a specific label or a certain role.
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: app
                operator: NotIn
                values:
                - powerprotect
Another popular use case is Multus and custom DNS configuration. Multus is a Container Network Interface (CNI) plugin for Kubernetes that enables attaching multiple network interfaces to pods. I won’t elaborate too much on Multus’ features and capabilities here, but I’ll show some examples of how to customize the PowerProtect pod specs to accept multiple NICs and custom DNS configuration.
spec:
  template:
    spec:
      dnsConfig:
        nameservers:
        - "10.0.0.1"
        searches:
        - lab.idan.example.com
        - cluster.local
metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: macvlan-conf
spec:
  template:
    spec:
      dnsConfig:
        nameservers:
        - "10.0.0.1"
        searches:
        - lab.idan.example.com
        - cluster.local
Always remember – documentation is your friend! The PowerProtect Data Manager Kubernetes User Guide has some useful information for any PPDM with K8s deployment.
Thanks for reading, and feel free to reach out with any questions or comments.
Idan
Author: Idan Kentor
Tue, 01 Aug 2023 19:09:43 -0000
Our innovations and support for our customers never stop. We continue to provide extensive data protection support for Oracle databases with PowerProtect Data Manager, and the following installer script enhancements are now available with PowerProtect Data Manager version 19.14 onwards:
Let’s dive in for more detailed information on these script enhancements.
As a root user, you can use the install.sh script to install or update the Oracle RMAN agent, PowerProtect Agent Service, and BoostFS agent. This script also configures the Oracle add-on for PowerProtect Data Manager. These installations occur as a single script installation when run as a root user.
You can also run this script as a non-root user (Oracle User) to install the Oracle RMAN agent only.
The install.sh script guides you through the installation process and requests input, where required, as shown in the following example installation:
You can specify the preferred FQDN or IP address of the Oracle RMAN agent host and the port to be used for communication between the Oracle RMAN agent and PowerProtect Data Manager. The specified port must be from the supported port ranges 7000 to 7009, and 7012 to 7020. The ports 7010 and 7011 are used by the agent service message bus.
Note: If you do not specify a port number, the default port 7000 is used as the PowerProtect Data Manager communication port.
Note: Run install.sh -h or install.sh --help to obtain more information about the script operation.
You can include multiple options in the install.sh command to perform a silent installation of the RMAN agent, PowerProtect agent service, and BoostFS agent, including add-on configurations. The install.sh script accepts command-line parameters and environment variables, so the installation can run automatically without user interaction. Run install.sh -h or install.sh --help to see more information about the command-line parameters and environment variables.
For example, the following command installs and registers the Oracle RMAN agent with system administrator privileges and installs the PowerProtect agent service and the BoostFS package. The command options specify the preferred agent host address, the communication port, and the configuration of the firewall exception rule. They also specify the retention time for Oracle RMAN agent backups.
Uninstall both Oracle RMAN agent and PowerProtect agent service as a root user
You can run the uninstall.sh script as a root user to uninstall both the Oracle RMAN agent and the PowerProtect Data Manager agent in the same uninstallation session. The script also unconfigures the Oracle add-on. The script can guide you through the uninstallation process, but it can also accept environment variables to complete the uninstallation automatically without user interaction.
The following example shows the sample uninstall script executed by a root user:
Notes:
With this efficient script enhancement, install, uninstall, and update operations can be done hassle-free as part of Oracle database protection.
For more details about Oracle database protection enhancements with Data Manager version 19.14, see our technical white paper PowerProtect Data Manager: Oracle RMAN Agent Backup and Recovery.
Author: Vinod Kumar Kumaresan, Principal Engineering Technologist, Data Protection Division
Wed, 17 May 2023 15:56:55 -0000
In this blog, let’s review application consistency for Kubernetes apps using PowerProtect Data Manager (PPDM).
PowerProtect Data Manager has been providing the ability to run pre/post K8s backup tasks (called hooks) for quite some time now. These hooks can certainly quiesce the database running on K8s pods before the backup starts and release it (end backup) as a post-action for app consistency, but they can also be used to run any pre/post actions on the pods as needed.
In this blog, I’ll also cover some use cases of app-consistency for K8s in PPDM and some advanced options. Hang tight, here we go.
You can manage K8s application consistency in PowerProtect Data Manager by using the ppdmctl utility, which includes example application templates for the following databases:
You can obtain the ppdmctl utility through the PowerProtect Data Manager UI (under System Settings > Downloads > Kubernetes) or directly using the following URL:
https://<your-ppdm-host>/k8s-binaries-download?filename=/usr/local/brs/lib/cndm/misc/ppdmctl.tar.gz
Note that you need to be logged in to the PPDM UI for this link to work.
You can also find the archive on the PPDM host itself at the following path: /usr/local/brs/lib/cndm/misc/ppdmctl.tar.gz
In order to run ppdmctl, you’ll need to extract the archive, change directory, and make it executable:
tar zxvf ppdmctl.tar.gz
cd ppdmctl
chmod +x ppdmctl
Now, before we see how to apply an application template, let’s look at some useful commands:
ppdmctl help – shows the main help page
ppdmctl applicationtemplate --help – shows help for a specific command, applicationtemplate in this case
ppdmctl applicationtemplate apply --help – shows help for a specific subcommand, in this case applicationtemplate apply
ppdmctl completion bash | sudo tee -a /etc/bash_completion.d/ppdmctl – applies autocompletion. Note that in this case we’re applying the Bash flavor, but Fish, PowerShell, and Zsh are also available.
You can apply an application template using the following command, for example by using one of the MySQL example templates:
ppdmctl applicationtemplate apply -i examples/mysqlapptemplate.yaml -n mysql
Applying an application template creates an applicationtemplates.powerprotect.dell.com CR:
kubectl get applicationtemplates -n mysql
NAME    AGE
mysql   1d
Let’s have a look at one of the included example application templates for MongoDB:
cat mongodbapptemplate1sts.yaml
apiVersion: "powerprotect.dell.com/v1beta1"
kind: ApplicationTemplate
metadata:
  name: mongodbtemplate
  namespace: mongodb-helm
spec:
  enable: true
  type: "MONGODB"
  appLabel: "app.kubernetes.io/name:mongodb"
  appActions:
    Pod:
      preHook:
        command: '["/bin/sh", "-c", "mongo -u root -p $MONGODB_ROOT_PASSWORD $MONGODB_PRIMARY_ROOT_PASSWORD --eval \"db.fsyncLock()\""]'
      postHook:
        command: '["/bin/sh", "-c", "mongo -u root -p $MONGODB_ROOT_PASSWORD $MONGODB_PRIMARY_ROOT_PASSWORD --eval \"db.fsyncUnlock()\""]'
    StatefulSet:
      selectors:
      - selectorTerms:
        - field: "Name"
          selectorExpression: ".*-[1-9][0-9]*$"
      - selectorTerms:
        - field: "Name"
          selectorExpression: ".*-0$"
Check out the following list for some guidance about the structure and format of app templates. Later in this blog we’ll explore more settings.
kubectl get pods -n <your-ns> --show-labels
6. appActions is required and must include Pod and optionally the StatefulSet or Deployment parameters.
7. Either or both preHook and postHook are required.
8. Either the preHook or postHook command must be provided as a JSON array. Here are some examples:
9. If you need to exclude PVCs from protection, make sure that all PVCs that are used by the backed-up pods remain included in the backup. Inclusion/exclusion of PVCs can be configured as part of the protection policy, either when creating a new protection policy through the PPDM UI or REST API or when editing an existing policy.
In some cases, multiple pods are matched by a single label. For example, when multiple instances of the MySQL database are provisioned, app=mysql would result in many matched pods. In such cases, you can specify multiple values under appLabel as key-value pairs in a comma-separated list, as in the following examples:
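(The label keys and values below are placeholders for illustration; use the labels that your pods actually carry, as reported by kubectl get pods --show-labels.)
appLabel: "app:mysql,app.kubernetes.io/instance:mysql-prod1"
appLabel: "app:mysql,release:production"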
This use case is quite important for multi-container pods. If you need to run pre/post commands on a specific container in a given pod, you can specify that container under Pod, in the same block as the command. Note that by default, the first container is used unless the container parameter is specified. Check out the following example:
kubectl get pods -n mariadb
NAME            READY   STATUS    RESTARTS   AGE
mariadb-sts-0   2/2     Running   0          20d
kubectl get pods -n mariadb mariadb-sts-0 -o jsonpath='{.spec.containers[*].name}'
mariadb maria-tools-sidecar
cat mariadbapptemplate.yaml
apiVersion: "powerprotect.dell.com/v1beta1"
kind: ApplicationTemplate
metadata:
  name: mariadbtemplate
  namespace: mariadb
spec:
  type: "MARIADB"
  enable: true
  appLabel: "mariadb-master:yes,app:mariadbprod1"
  appActions:
    Pod:
      preHook:
        command: "[\"/bin/sh\", \"-c\", \"export BACKUPDIR=/var/lib/mysql/backups; if ls $BACKUPDIR/*_full.bak 1> /dev/null 2>&1; then mariabackup --backup --stream=mbstream --extra-lsndir=$BACKUPDIR/backup_incr --incremental-basedir=$BACKUPDIR/backup_base --user=root --password=$MARIADB_ROOT_PASSWORD | gzip --rsyncable > $BACKUPDIR/backup.$(date +%F_%R:%S)_incr.bak; else mariabackup --backup --stream=mbstream --extra-lsndir=$BACKUPDIR/backup_base --user=root --password=$MARIADB_ROOT_PASSWORD | gzip --rsyncable > $BACKUPDIR/backup.$(date +%F_%R:%S)_full.bak; fi; exit $?\"]"
        container: "mariadb"
…
Another important capability is controlling the command timeout, which can be done by specifying the timeout parameter. By default, each command has a timeout of 30 seconds. For cases where there is a risk that the command might take longer, the timeout parameter can be specified. This is especially relevant for DB dump/backup processes.
For example, let’s look at the first app template but this time with the timeout parameter:
apiVersion: "powerprotect.dell.com/v1beta1"
kind: ApplicationTemplate
metadata:
  name: mongodbtemplate
  namespace: mongodb-helm
spec:
  enable: true
  type: "MONGODB"
  appLabel: "app.kubernetes.io/name:mongodb"
  appActions:
    Pod:
      preHook:
        command: '["/bin/sh", "-c", "mongo -u root -p $MONGODB_ROOT_PASSWORD $MONGODB_PRIMARY_ROOT_PASSWORD --eval \"db.fsyncLock()\""]'
        timeout: 60
…
The final advanced capability I want to talk about today is onError. onError defaults to Fail, but Continue is another possible value, which means that if a certain pre or post hook fails, the backup flow carries on.
Here’s the last application template, but with the onError parameter this time:
apiVersion: "powerprotect.dell.com/v1beta1"
kind: ApplicationTemplate
metadata:
  name: mongodbtemplate
  namespace: mongodb-helm
spec:
  enable: true
  type: "MONGODB"
  appLabel: "app.kubernetes.io/name:mongodb"
  appActions:
    Pod:
      preHook:
        command: '["/bin/sh", "-c", "mongo -u root -p $MONGODB_ROOT_PASSWORD $MONGODB_PRIMARY_ROOT_PASSWORD --eval \"db.fsyncLock()\""]'
        timeout: 60
        onError: Continue
…
Always remember – the documentation is your friend. Specifically, the PowerProtect Data Manager Kubernetes User Guide has some useful information for any PPDM with K8s deployment.
Feel free to reach out with any questions or comments.
Thanks for reading,
Idan
Author: Idan Kentor
Tue, 24 Jan 2023 11:00:27 -0000
In today’s enterprise, it is certainly not a surprise to hear of a mission-critical application server experiencing downtime or degraded state due to disaster recovery situations, such as hardware failures and cyberattacks. In such situations, bare-metal recovery (BMR) is obviously one of the best disaster recovery solutions. Most users rely on the BMR procedure to restore mission-critical applications, operating environments, and data.
With Dell PowerProtect Data Manager, BMR for a Windows server can be performed efficiently with just a few clicks. Before we explore more about Windows server BMR with PowerProtect Data Manager, let us briefly take a look at BMR.
BMR, also known as offline recovery, is used as part of a disaster recovery plan that provides protection when a server or a computer will not start after a catastrophic failure. The term bare metal is in reference to a computer without a base operating system or applications. The goal of BMR is to bring a server or computer to the state it was in before the failure.
BMR can be used to recover from the following situations:
Starting with version 19.10, PowerProtect Data Manager supports using the file system agent to back up the disaster recovery asset and perform a BMR of a Windows host.
You can use BMR for the following operations:
With PowerProtect Data Manager, you can perform BMR by backing up the disaster recovery asset. When a file system agent backup is performed, there is an extra asset—”disaster recovery”—that is backed up. This asset includes the information required to rebuild the Windows system back to its state at the time of the backup. The data in the disaster recovery asset, plus volume information for those file systems that contain operating system data (critical volumes), are also backed up.
After the disaster recovery asset backup is successful, you can perform the BMR using the customized PowerProtect Data Manager WinPE ISO image. By default, each BMR backup is system state enabled.
After you install the file system agent on the Windows file system host and it is approved in the PowerProtect Data Manager UI, the disaster recovery asset is discovered along with the other file system assets.
After the disaster recovery asset is discovered in PowerProtect Data Manager UI, you can create a file system protection policy and configure it to back up the disaster recovery asset. A disaster recovery protection policy should contain objects to be backed up, which include critical volumes and system state recovery files.
After the disaster recovery asset backup is successful, you can perform BMR using the customized PowerProtect Data Manager WinPE ISO image.
BMR data consists of the following:
Note: Critical volumes include the boot volume, the system volume, and the volume that hosts system state data, such as Active Directory and application services.
By default, each BMR backup is system state enabled.
To protect a Windows host entirely, we recommend that you back up BMR data for critical volumes and separately back up regular assets that contain user data.
PowerProtect Data Manager provides a custom WinPE image that allows you to recover a source host to a target host without installing an operating system. Because local disks are not in use by the operating system, the recovery process can replace files without conflict. The custom PowerProtect Data Manager WinPE image is based on Windows PE 10.0 and contains the NIC and disk drivers for the Windows versions that the WinPE image supports.
Before you perform a BMR, verify that the environment meets the requirements and that you have the necessary information.
Note: For BMR requirements, see the PowerProtect Data Manager File System User Guide.
When a recovery of a system is required, you can download the Windows BMR ISO image from the PowerProtect Data Manager UI. The image contains the necessary files to boot and create a WinPE system. The image includes a PowerProtect Data Manager Bare Metal Recovery Wizard that is launched and used as part of the restore.
Note: Ensure that the hardware on the target computer is operational and that the target computer is similar in make, model, and hardware configuration to the source computer to be recovered. For more details about the BMR requirements, see the PowerProtect Data Manager File System User Guide.
The target host boots with the custom WinPE image, either locally or over the network. The Welcome page of the PowerProtect Data Manager Bare Metal Recovery Wizard is displayed.
On the Select NIC page, you can select the network interface for communication with Data Manager during the BMR. If the required NIC driver is not in the list, click Load Driver to browse to it.
Note: The driver must not require a restart. The WinPE environment loads only in memory, and changes are not persistent across a restart. If a restart prompt appears, you might be able to ignore the prompt. Most NIC drivers are plug-and-play.
On the Host and Network Configuration page, enter the hostname of the target host and the domain name for the host.
On the Available Disks page, verify the disk configuration. The size and number of hard disks that are added to the target machine should be either equal to or greater than the size and number of disks on the source machine.
On the Select Server page, enter the PowerProtect Data Manager server and source hostname details. In the Server Name or IP field, add the IP of the server or FQDN only.
On the Select Backup page, select the respective host from the Source Host Name list and BMR data to restore to the destination host. Backups appear in the list in descending order from the most to least recent.
The Critical Volumes page displays the volumes that will be restored and the option to enable a quick disk format.
The PowerProtect Data Manager BMR wizard fetches information to perform a BMR, and the Summary page is displayed. To add custom BMR options, next to Custom restore options, click Options.
Confirm the quick format of disks and restore the backup.
The Status page shows the restore progress.
You can also monitor the Bare Metal Recovery job status in the PowerProtect Data Manager UI at Jobs > Protection Jobs.
The BMR wizard displays the results. After the recovery is successful, you can reboot the system and restore the application data as required.
With this easy BMR solution with PowerProtect Data Manager, Dell Technologies empowers Windows administrators to recover their business-critical Windows servers quickly and resume their operations.
For more details about disaster recovery for Windows with PowerProtect Data Manager, see the technical white paper and PowerProtect Data Manager: File System User Guide.
For more details about data protection with PowerProtect Data Manager, see the PowerProtect Data Manager website.
Author: Vinod Kumar Kumaresan, Principal Engineering Technologist, Data Protection Division
Mon, 05 Dec 2022 20:31:24 -0000
As I said in my previous blog on EKS protection, the topic of Kubernetes protection is something I’m asked quite often. GKE is no different so why not publish another blog just for GKE protection in PPDM? Expect more blogs on these topics soon 😊.
Back to the topic at hand - GKE protection with PPDM. When talking about PPDM Protection of GKE workloads, the following pillars come to mind:
First, let’s briefly discuss the tools we need to install for managing GKE:
You can configure GKE clusters by using the Google Cloud console, the command line (gcloud), or REST. Documentation is your friend: https://cloud.google.com/kubernetes-engine/docs.
In my specific case, some fundamental configuration elements include the following:
I’ve used the following gcloud command to deploy my GKE cluster (not the shortest command you’re ever going to run, but an effective one 😊):
gcloud container --project "gcp-dev-idan-pte" clusters create "idan-cluster-1" --region "us-east1-a" --no-enable-basic-auth --cluster-version "1.24.6-gke.1500" --release-channel "None" --machine-type "e2-medium" --image-type "COS_CONTAINERD" --disk-type "pd-balanced" --disk-size "100" --metadata disable-legacy-endpoints=true --scopes "https://www.googleapis.com/auth/cloud-platform" --max-pods-per-node "110" --num-nodes "4" --logging=SYSTEM,WORKLOAD --monitoring=SYSTEM --enable-private-nodes --master-ipv4-cidr "10.10.0.0/28" --enable-ip-alias --network "projects/gcp-dev-idan-pte/global/networks/gke" --subnetwork "projects/gcp-dev-idan-pte/regions/us-east1/subnetworks/gke1" --cluster-secondary-range-name "secondary2" --services-secondary-range-name "secondary" --no-enable-intra-node-visibility --default-max-pods-per-node "110" --no-enable-master-authorized-networks --addons HorizontalPodAutoscaling,HttpLoadBalancing,GcePersistentDiskCsiDriver --no-enable-autoupgrade --enable-autorepair --max-surge-upgrade 1 --max-unavailable-upgrade 0 --enable-shielded-nodes
Deploying PowerProtect Data Manager and PowerProtect DD is a piece of cake. We just need to launch the deployment of PowerProtect Data Manager and PowerProtect DD Virtual Edition from the marketplace, provide the zone to deploy on, provide the network and subnet to deploy them, and optionally provide IP addresses for PPDM, DDVE, and DNS. This deployment process provisions the instances and rules for both PowerProtect Data Manager and PowerProtect DD. PowerProtect DD can be deployed separately or along with PPDM. (Remember that the newly deployed PowerProtect Data Manager can also leverage an existing PowerProtect DDVE.)
For more info, see Dell PowerProtect Data Manager: Deployment and Configuration on Google Cloud.
1. To configure the gcloud CLI, run the following command:
gcloud init
2. List your GKE clusters:
gcloud container clusters list --region <region-id>
3. To use the new gcloud auth plugin, enter:
gcloud components install gke-gcloud-auth-plugin
Note: You might need to run additional commands depending on the OS type you’re using so just follow the on-screen instructions.
4. Configure kubectl to interact with your GKE cluster:
gcloud container clusters get-credentials <your-gke-cluster-name>
5. Verify that kubectl works with your cluster:
kubectl get nodes
6. List CSI drivers and make sure that the Compute Engine Persistent Disk CSI Driver (pd.csi.storage.gke.io) is installed:
kubectl get csidrivers
7. List all storage classes:
kubectl get storageclasses
In this blog, we are leveraging the ‘standard-rwo’ storage class. Although other storage classes can be created and used, make sure to use the pd.csi.storage.gke.io provisioner. Avoid using the legacy (deprecated) kubernetes.io/gce-pd in-tree storage plugin.
8. Create the Volume Snapshot Class YAML file:
cat <<EOF | tee snapclass-standard-rwo.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: standardrwo-pd-snapclass
driver: pd.csi.storage.gke.io
deletionPolicy: Delete
parameters:
  storage-locations: <your-region>
EOF
Make sure to provide the relevant region (such as us-east1) on which GKE runs under the ‘storage-locations’ parameter. It’s not just recommended from a performance standpoint but also for cases where there are project-level constraints that are set to limit resources to a specific region, or where there is a need to stick to a specific region.
9. Create the snapshot class:
kubectl apply -f snapclass-standard-rwo.yaml
10. Make sure it got created:
kubectl get volumesnapshotclass
11. Patch the standard storage class to remove the default setting:
kubectl patch storageclass standard -p "{\"metadata\": {\"annotations\":{\"storageclass.kubernetes.io/is-default-class\":\"false\"}}}"
12. Set standard-rwo as the default storage class:
kubectl patch storageclass standard -p "{\"metadata\": {\"annotations\":{\"storageclass.kubernetes.io/is-default-class\":\"true\"}}}"
13. Make sure that standard-rwo shows up as the default storage class:
kubectl get sc
For private GKE clusters, we need to add the PowerProtect-required container images to a registry that the cluster can pull from. Because the images are hosted on Docker Hub, they wouldn’t be available to a private cluster, which means that we need a registry such as Google Container Registry (GCR) or Google Artifact Registry. In this blog, I will provide examples of how to work with Google Artifact Registry.
14. The required container images are:
15. The versions of these required container images might differ from one PPDM release to another. The supported versions of PPDM 19.12.0-9 are:
dellemc/powerprotect-k8s-controller:19.12.0-19
dellemc/powerprotect-cproxy:19.12.0-19
dellemc/powerprotect-velero-dd:19.12.0-19
velero/velero:1.9.1
Note: The image requirements for the respective PPDM release can be found in the following file on the PPDM server:
/usr/local/brs/lib/cndm/config/k8s-image-versions.info
16. On the same host that is running kubectl and gcloud, add credentials for Docker authentication to the Registry:
gcloud auth configure-docker us-east1-docker.pkg.dev
Be sure to specify the appropriate region.
17. Enable the Artifact Registry API as needed:
gcloud services enable artifactregistry.googleapis.com
18. Create the Artifact repository:
gcloud artifacts repositories create ppdm --repository-format=docker --location=us-east1
19. Pull the images by running the following commands:
docker pull dellemc/powerprotect-k8s-controller:19.12.0-19
docker pull dellemc/powerprotect-cproxy:19.12.0-19
docker pull dellemc/powerprotect-velero-dd:19.12.0-19
docker pull velero/velero:v1.9.1
20. Tag and push the images to the Artifact repository. Be sure to modify the project name:
docker tag dellemc/powerprotect-k8s-controller:19.12.0-19 us-east1-docker.pkg.dev/gcp-dev-idan-pte/ppdm/dellemc/powerprotect-k8s-controller:19.12.0-19
docker push us-east1-docker.pkg.dev/gcp-dev-idan-pte/ppdm/dellemc/powerprotect-k8s-controller:19.12.0-19
docker tag dellemc/powerprotect-cproxy:19.12.0-19 us-east1-docker.pkg.dev/gcp-dev-idan-pte/ppdm/dellemc/powerprotect-cproxy:19.12.0-19
docker push us-east1-docker.pkg.dev/gcp-dev-idan-pte/ppdm/dellemc/powerprotect-cproxy:19.12.0-19
docker tag dellemc/powerprotect-velero-dd:19.12.0-19 us-east1-docker.pkg.dev/gcp-dev-idan-pte/ppdm/dellemc/powerprotect-velero-dd:19.12.0-19
docker push us-east1-docker.pkg.dev/gcp-dev-idan-pte/ppdm/dellemc/powerprotect-velero-dd:19.12.0-19
docker tag velero/velero:v1.9.1 us-east1-docker.pkg.dev/gcp-dev-idan-pte/ppdm/velero/velero:v1.9.1
docker push us-east1-docker.pkg.dev/gcp-dev-idan-pte/ppdm/velero/velero:v1.9.1
21. Verify that all images are available on the repository:
gcloud artifacts docker images list us-east1-docker.pkg.dev/gcp-dev-idan-pte/ppdm
Finally, let’s add our GKE cluster to PPDM. Follow these steps to gather some information and register GKE to PPDM.
1. Get the K8s cluster control-plane endpoint:
kubectl cluster-info
2. For private clusters with external control plane, run the following command to obtain the private endpoint:
gcloud container clusters describe <your-gke-cluster-name> --zone <your-zone> | grep privateEndpoint
3. To create a service account on the GKE cluster for PPDM discovery and operations, PPDM RBAC YAML files need to be applied.
a. Retrieve the rbac.tar.gz file from the PPDM appliance at the following location:
/usr/local/brs/lib/cndm/misc/rbac.tar.gz
b. In PPDM 19.12, download the archive from the PowerProtect Data Manager UI (under System Settings > Downloads > Kubernetes) or directly using the following URL:
https://<your-ppdm-server>/k8s-binaries-download?filename=/usr/local/brs/lib/cndm/misc/rbac.tar.gz
Note that the link will only work if you’re logged into the PPDM UI. You can also find the archive on the PPDM server itself at the following path: /usr/local/brs/lib/cndm/misc/rbac.tar.gz
c. Extract the archive, navigate to the rbac directory, and apply the two YAML files using the following commands:
kubectl apply -f ppdm-discovery.yaml
kubectl apply -f ppdm-controller-rbac.yaml
d. If you’re using K8s 1.24 or later, you must manually create the secret for the PPDM discovery service account:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: ppdm-discovery-serviceaccount-token
  namespace: powerprotect
  annotations:
    kubernetes.io/service-account.name: "ppdm-discovery-serviceaccount"
EOF
e. Retrieve the secret key:
kubectl describe secret $(kubectl get secret -n powerprotect | awk '/disco/{print $1}') -n powerprotect | awk '/token:/{print $2}'
4. Retrieve the GKE cluster root CA:
gcloud container clusters describe <your-gke-cluster-name> --zone <your-zone> | grep -i clustercacert | awk '{print $2}'
For cases like mine where PPDM does not have an external IP, we can configure a Launch Pad VM and connect to it using IAP (Identity Aware Proxy). Here are some high-level steps:
5. Create a Windows Launch Pad VM using the Google Cloud console or by running the following command:
gcloud compute instances create win2k22lp1 --machine-type=e2-medium --scopes=cloud-platform --enable-display-device --image-family=windows-2022 --image-project=windows-cloud --boot-disk-size=50GB --zone us-east1-a --network=gke --subnet=gke1 --no-address
Make sure to alter the network, subnet, and zone as needed.
6. Set the Windows password by using either the Google Cloud console or the CLI command:
gcloud compute reset-windows-password "win2k22lp1" --zone us-east1-a
7. Enable and configure IAP at the project level if needed.
8. Download and install IAP Desktop.
9. Log in, edit the connection settings to the Launch Pad VM, and enter the credentials retrieved in Step 6.
10. Connect to the Launch Pad VM.
Without further ado, let’s navigate to the PowerProtect Data Manager UI and register our GKE cluster as a Kubernetes Asset Source.
11. Navigate to Infrastructure -> Asset Sources.
12. Enable the Kubernetes Asset Source as needed and navigate to the Kubernetes tab.
13. Add the GKE cluster as a Kubernetes Asset Source:
Some guidelines for adding GKE as an asset source:
14. For FQDN/IP, use the endpoint retrieved in Step 1 (make sure to remove the https:// prefix), or use the IP that we retrieved in Step 2.
15. Specify port 443. Make sure to allow tcp/443 in the firewall for GKE (ingress) and for PPDM (egress). Also, allow tcp/111, tcp/2049, and tcp/2052 from GKE to PowerProtect DD.
16. Create new credentials with the Service Account Token from Step 3e.
17. Specify the Root Certificate:
a. On PPDM versions earlier than 19.12, follow these steps:
-- Convert the GKE root CA to BASE64 using the following command:
gcloud container clusters describe <your-gke-cluster-name> --zone <your-zone> | grep -i clustercacert | awk '{print $2}' | base64 -d
-- SSH to the PPDM server using the admin user and save the root CA in BASE64 to a file, say gke-cert.txt. Make sure to include the BEGIN and END CERTIFICATE lines.
-- Execute the following command:
ppdmtool -importcert -alias <your-gke-cluster-name> -file gke-cert.txt -t BASE64
b. On PPDM 19.12 and later, select Advanced Options in the same Add Kubernetes screen and scroll down. Specify the root certificate that we retrieved in Step 4.
18. Under Controller Configuration, add the following key-value pair to use the container images from the registry that we configured in Step 18 in the section “PowerProtect K8s container images”.
Key: k8s.docker.registry
Value: us-east1-docker.pkg.dev/gcp-dev-idan-pte/ppdm
Make sure to change the value to match your repository address.
With PPDM 19.10, this setting can be applied using the REST API, specifically POST /api/v2/inventory-sources on an existing K8s asset source. To do this, follow the instructions described in Updating PowerProtect Data Manager pod configurations on https://developer.dell.com; a short illustrative sketch follows these steps.
19. Scroll down, verify the certificate, and click Save to register the GKE cluster as a Kubernetes Asset Source.
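Going back to the REST-API path mentioned in step 18 for PPDM 19.10: purely as a rough, hedged Python sketch (not an official example), locating the Kubernetes asset source could look like the following. The port, the "content" wrapper, and the KUBERNETES type string are assumptions based on the public PPDM REST API documentation; take the exact update payload and verb from the developer.dell.com article referenced above.
import requests
import urllib3

urllib3.disable_warnings()  # assuming PPDM uses a self-signed certificate

PPDM_API = "https://ppdm.example.com:8443/api/v2"   # placeholder PPDM FQDN
token = "<bearer-token-from-POST-/api/v2/login>"    # obtain via the login endpoint first
headers = {"Authorization": f"Bearer {token}"}

# List the inventory sources and pick out the Kubernetes asset source
response = requests.get(f"{PPDM_API}/inventory-sources", headers=headers, verify=False, timeout=30)
response.raise_for_status()
for source in response.json().get("content", []):
    if source.get("type") == "KUBERNETES":
        print(source.get("id"), source.get("name"))
        # This id identifies the asset source whose controller configuration
        # (for example k8s.docker.registry) you would update per the
        # developer.dell.com instructions.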
There you have it! Now you can deploy your stateful applications on your GKE cluster using kubectl or straight from the GCP marketplace and protect their namespaces by creating a new Protection Policy 👍🏻.
Feel free to reach out with any questions or comments.
Thanks for reading,
Idan
Author: Idan Kentor
idan.kentor@dell.com
Thu, 17 Nov 2022 18:03:30 -0000
PowerProtect Data Manager supports the protection of a multitude of K8s distributions for on-prem as well as in the cloud (see the compatibility matrix). In this blog, I’ll show how to use PowerProtect Data Manager (or PPDM for short) to protect AWS Elastic Kubernetes Service (EKS) workloads.
I’ve been asked many times recently if PPDM supports protection of Amazon EKS workloads, and if so, how the configuration goes. So, I thought it would be good to talk about that in a blog -- so here we are! In essence, the challenging piece (no issues, maybe challenges 😊) is the configuration of the EBS CSI driver, so I’ll cover that extensively in this blog. And because the deployment and configuration of the EBS CSI driver has changed recently, there is all the more reason to get this information out to you.
Deploying PowerProtect Data Manager and PowerProtect DD are both pretty straightforward. You just need to launch the PowerProtect Data Manager installation from the marketplace, answer some network and other questions, and off you go. It creates an AWS CloudFormation stack that deploys all the required services of both PowerProtect Data Manager and PowerProtect DD. PowerProtect DD can be deployed separately or along with PPDM. Naturally, the newly deployed PowerProtect Data Manager can also leverage an existing PowerProtect DD.
Deploying and configuring the EKS cluster and Node groups is rather simple and can be done using the AWS management console, AWS CLI, or eksctl. For more information, the official Amazon EKS documentation is your friend.
It’s important to talk about the tools we need installed for managing Amazon EKS and to deploy and manage the EBS CSI driver:
Let’s look at some general steps before we go ahead and deploy the EBS CSI driver.
1. To configure AWS CLI, run the following command:
aws configure
2. List your EKS clusters:
aws eks --region <region-code> list-clusters
3. Configure kubectl to operate against your EKS cluster:
aws eks update-kubeconfig --name <your-eks-cluster-name>
The final step before we can deploy the EBS CSI driver is to deploy the external snapshotter.
1. To deploy the snapshotter, execute the following commands:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml
2. Make sure the snapshot controller pods are running:
kubectl get pods -n kube-system
And now for the main event, the configuration of the EBS CSI driver. There are two ways to go about it – deploying the EBS CSI driver as an EKS add-on or as a self-managed driver. You can use either the AWS management console or AWS CLI (eksctl) to deploy the EBS CSI Driver add-on. The self-managed driver is installed and operated exclusively using kubectl.
The following procedure represents my thoughts and experience for a quick and comprehensive configuration - there are a few ways to climb a mountain, as they say. Refer to the documentation for all possible ways.
1. Create or use an existing IAM user and map the required policy for the EBS CSI Driver to the user:
a. Create an IAM user:
aws iam create-user --user-name <user-name>
b. Create an IAM policy and record the Policy ARN:
aws iam create-policy --policy-name <policy-name> --policy-document https://raw.githubusercontent.com/kubernetes-sigs/aws-ebs-csi-driver/master/docs/example-iam-policy.json
c. Attach the policy to the user:
aws iam attach-user-policy --user-name <user-name> --policy-arn <policy-arn>
d. Create an access key and record the AccessKeyId and SecretAccessKey:
aws iam create-access-key --user-name <user-name>
2. Create a secret. Here we’re creating a secret and mapping it to an existing IAM user and its credentials (that is, the access keys recorded in the previous step):
kubectl create secret generic aws-secret --namespace kube-system --from-literal "key_id=<iam-user-access-key-id>" --from-literal "access_key=<iam-user-secret-access-key>"
3. Install the EBS CSI Driver:
kubectl apply -k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.12"
4. Make sure that the ebs-csi-controller and ebs-csi-nodes pods are running:
kubectl get pods -n kube-system
Now let’s look at deploying the EBS CSI driver as an EKS add-on instead.
1. Retrieve the EKS cluster OIDC provider issuer URL:
aws eks describe-cluster --name <your-eks-cluster-name> --query "cluster.identity.oidc.issuer" --output text
2. Check whether your cluster’s OIDC provider already appears in the list of IAM OIDC providers:
aws iam list-open-id-connect-providers
3. If the provider is not on the list, associate it by running the following command:
eksctl utils associate-iam-oidc-provider --cluster <your-eks-cluster-name> --approve
4. Create the IAM role. This also attaches the required policy and annotates the EBS CSI driver service account on the EKS cluster:
eksctl create iamserviceaccount --name ebs-csi-controller-sa --namespace kube-system --cluster <your-eks-cluster-name> --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy --approve --role-only --role-name <role-name>
5. Make sure that the aws-ebs-csi-driver is not installed:
aws eks list-addons --cluster-name <your-eks-cluster-name>
6. Get the AWS Account ID:
aws sts get-caller-identity --query "Account" --output text
7. Deploy the EBS CSI Driver add-on. Note that it will deploy the default add-on version for your K8s version. Specify the AWS account ID retrieved in the previous step and the IAM role specified in Step 4.
eksctl create addon --name aws-ebs-csi-driver --cluster <your-eks-cluster-name> --service-account-role-arn arn:aws:iam::<your-aws-account-id>:role/<role-name> --force
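To confirm which add-on version was actually installed, you can run something like the following (assuming the standard describe-addon output fields):
aws eks describe-addon --cluster-name <your-eks-cluster-name> --addon-name aws-ebs-csi-driver --query "addon.addonVersion" --output text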
8. Make sure that the ebs-csi-controller and ebs-csi-nodes pods are running:
kubectl get pods -n kube-system
With the EBS CSI driver deployed (using either method), the next step is to configure the Volume Snapshot Class and the Storage Class.
1. Create the Volume Snapshot Class YAML file:
cat <<EOF | tee snapclass.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-aws-vsc
driver: ebs.csi.aws.com
deletionPolicy: Delete
EOF
2. Create the Snapshot Class:
kubectl apply -f snapclass.yaml
3. Make sure that the Snapshot Class got created:
kubectl get volumesnapshotclass
4. Create the Storage Class YAML file:
cat <<EOF | tee ebs-sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
EOF
5. Create the Storage Class:
kubectl apply -f ebs-sc.yaml
6. Patch the gp2 storage class to remove the default setting:
kubectl patch storageclass gp2 -p "{\"metadata\": {\"annotations\":{\"storageclass.kubernetes.io/is-default-class\":\"false\"}}}"
7. Make sure that the EBS Storage Class got created and that it shows up as the default storage class:
kubectl get storageclass
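Alternatively, you can query the default-class annotation directly with jsonpath as a quick sanity check (this should print true):
kubectl get storageclass ebs-sc -o jsonpath='{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}'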
Now, for the grand finale – adding our EKS cluster to PPDM. Follow these steps to gather some information and then register EKS to PPDM.
1. Get the K8s cluster control-plane endpoint:
kubectl cluster-info
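If you prefer to retrieve the endpoint programmatically instead of copying it from the kubectl output, the AWS CLI returns it directly (remember to strip the https:// prefix before registering it in PPDM):
aws eks describe-cluster --name <your-eks-cluster-name> --query "cluster.endpoint" --output text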
2. To create a service account on the EKS cluster for PPDM discovery and operations, apply the PPDM RBAC YAML files.
a. Retrieve the rbac.tar.gz archive from the PPDM appliance at the following location:
/usr/local/brs/lib/cndm/misc/rbac.tar.gz
b. Alternatively, on PPDM 19.12 and later, you can download the archive from the PowerProtect Data Manager UI under System Settings > Downloads > Kubernetes, or directly using the following URL (the link only works while you’re logged in to the PPDM UI):
https://<your-ppdm-server>/k8s-binaries-download?filename=/usr/local/brs/lib/cndm/misc/rbac.tar.gz
c. Extract the archive, navigate to the rbac directory, and apply the two YAML files using the following commands:
kubectl apply -f ppdm-discovery.yaml
kubectl apply -f ppdm-controller-rbac.yaml
d. If you’re using K8s 1.24 or later, then you must manually create the secret for the PPDM discovery service account:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: ppdm-discovery-serviceaccount-token
  namespace: powerprotect
  annotations:
    kubernetes.io/service-account.name: "ppdm-discovery-serviceaccount"
EOF
e. Retrieve the secret key using the following command:
kubectl describe secret $(kubectl get secret -n powerprotect | awk '/disco/{print $1}') -n powerprotect | awk '/token:/{print $2}'
3. Retrieve the EKS cluster root CA:
eksctl get cluster <your-eks-cluster-name> -o yaml | awk '/Cert/{getline; print $2}'
Without further ado, let’s navigate to the PowerProtect Data Manager UI and register our EKS cluster as a Kubernetes Asset Source.
4. Navigate to Infrastructure -> Asset Sources.
5. Enable the Kubernetes Asset Source as needed and navigate to the Kubernetes tab.
6. Add the EKS cluster as a Kubernetes Asset Source:
When filling in the Add Kubernetes dialog, note the following:
7. Use the FQDN you retrieved in Step 1. Make sure to remove the https:// prefix.
8. Specify port 443. Make sure to add tcp/443 to the EKS security group (inbound) and the PPDM security group (outbound).
9. Create new credentials with the Service Account Token from Step 2e.
10. Root Certificate:
a. On PPDM versions earlier than 19.12, decode the cluster root certificate to a file, copy that file to the PPDM appliance, and import it there with ppdmtool:
eksctl get cluster <your-eks-cluster-name> -o yaml | awk '/Cert/{getline; print $2}' | base64 -d > eks-cert.txt
ppdmtool -importcert -alias <your-eks-cluster-name> -file eks-cert.txt -t BASE64
b. On PPDM 19.12 and later, click Advanced Options on the same Add Kubernetes screen and scroll down. Specify the root certificate from Step 3.
11. Verify the certificate and click Save to register the EKS cluster as a Kubernetes Asset Source.
That’s it! Now you can deploy your stateful applications on your EKS cluster and protect their namespaces by creating a new Protection Policy 👍🏻.
Feel free to reach out with any questions or comments.
Thanks for reading,
Idan
Author: Idan Kentor
idan.kentor@dell.com
Thu, 11 Aug 2022 13:55:33 -0000
|Read Time: 0 minutes
VMware is arguably the leader in enterprise virtualization today. Used by countless enterprises to host their business-critical workloads, it has great features and capabilities across the board when it comes to manageability, scalability, and high-availability.
While VMware vSphere offers great high-availability capabilities for your customers and business stakeholders, you still must protect your data from disaster. At the heart of any good disaster recovery plan are effective backups. When it comes to protecting virtual machines, there are best practices you want to follow when backing up your data.
For most businesses, when it comes to virtual machines (VMs), any kind of software or hardware failure typically results in some degree of financial loss when there are service interruptions. Actions such as granular restoration and full disaster recovery can also be especially useful in specific use cases. And you can perform all of these data recovery techniques (and more) with a proper VMware virtual machine backup solution.
Dell PowerProtect Data Manager provides software-defined data protection, automated discovery, deduplication, operational agility, self-service, and IT governance for physical, virtual, and cloud environments.
PowerProtect Data Manager provides a consistent protection experience for VMware environments. Data Manager is the only solution to provide native vSphere integration with vCenter for VM protection, allowing storage and backup admins, as well as VM owners, to choose a storage policy to apply to each VM automatically when it is instantiated.
PowerProtect Data Manager integration with VMs helps to manage, protect, and reuse VM data across the enterprise by deploying services to accomplish the following tasks:
VM full backup: Captures all the virtual machine disks at the same time and backs up the data to storage targets to create a transactionally consistent backup. This option can be used for Windows and Linux VMs, and for guest operating systems running applications other than SQL Server.
Application aware full backup is an extension of VM full backup. For VMs with a SQL application installed, select this type to quiesce the application to perform the SQL database and transaction log backup. When you select this type, you must provide Windows account credentials for the VM.
VM Direct Engine is the protection appliance within PowerProtect Data Manager that allows you to manage and protect VM assets. The PowerProtect Data Manager software comes pre-bundled with an embedded VM Direct engine and allows you to deploy additional external VM Direct Engines.
VM Direct Engine is deployed in the vSphere environment to perform virtual machine snapshot backups. VM Direct Engine improves performance and reduces network bandwidth utilization by using PowerProtect DD series appliance source-side deduplication.
Basically, there are two types of VM Direct Engines: Internal and External.
Note: When you deploy and register an additional VM Direct Engine, PowerProtect Data Manager uses this appliance instead of the embedded VM Direct Engine. If the added VM Direct Engine is not available, the embedded VM Direct Engine is used to ensure that backups complete successfully.
Depending on the vSphere version and the VM protection policy options selected, VM Direct can protect VMs by using traditional VMware APIs for Data Protection (VADP) snapshots, or by using the Transparent Snapshot Data Mover (TSDM) protection mechanism introduced in PowerProtect Data Manager version 19.9.
An external VM Direct Engine is not required for VM protection policies that use the Transparent Snapshot Data Mover (TSDM) protection mechanism. For these policies, the embedded VM Direct Engine is sufficient.
PowerProtect Data Manager supports Change Block Tracking (CBT), which allows backup applications to determine the delta of changes in the VM since the last backup, and only read and transfer those changes when doing the next backup incrementally. This allows PowerProtect Data Manager to use one or more VM proxies to read a VM's disk changes and to transfer them.
Any L0 backup of a VM reads the entire contents of all disks and writes the same to storage using DD Boost (leveraging global deduplication). Any non-L0 backup of a CBT enabled VM will only read changes in the disks from the last backup and overlay those changes on a copy of the last backup to generate a new full backup (while moving only incremental changes).
Backup files are written to storage using fixed size segments (FSS) of 8K. Backup files on storage are always thick, that is, VMDK file-size on storage is equal to the size of the provisioned disk.
TSDM is a protection mechanism in PowerProtect Data Manager and was designed to replace the VMware vStorage API for Data Protection (VADP) protection mechanism for crash-consistent VM protection.
The advantages of using the TSDM protection mechanism for VM data protection include the following:
Transparent snapshots architecture
For more details, see the technical white paper VMware Virtual Machine Protection using Transparent Snapshots.
Note: PowerProtect Data Manager manages the TSDM component by using the VIB (VMware Certified) from Dell Technologies. This component is installed dynamically as part of the integration of PowerProtect Data Manager that requires protection of VMs using Transparent Snapshots. The APIs being used are supported in VMware ESXi 7.0 U3 and later.
PowerProtect Data Manager supports the HotAdd and Network Block Device (NBD) transport modes. You select a transport mode when adding the vProxy appliance: Hot Add, NBD, or the default setting, Hot Add with fallback to Network Block Device.
In NBD mode, the ESX/ESXi host reads data from storage and sends it across the network to the backup proxy, which writes it to the target storage. As its name implies, this transport mode is not LAN-free, unlike SAN transport.
HotAdd is a VMware feature in which devices can be added “hot” while a VM is running. Besides SCSI disks, running VMs can also add CPUs and memory capacity. If backup software runs in a virtual appliance, it can take a snapshot and create a linked clone of the target virtual machine, then attach and read the linked clone’s virtual disks for backup. This involves a SCSI HotAdd on the ESXi host where the target VM and backup proxy are running. Virtual disks of the linked clone are Hot Added to the backup proxy. The target VM continues to run during backup.
If you want to make a subset of the PowerProtect Data Manager UI functionality available within the vSphere Client, select vSphere Plugin. When adding a vCenter server as an asset source in the PowerProtect Data Manager UI, if you enable the vSphere Plugin option, a pane for PowerProtect appears in the vSphere Client.
This pane provides a subset of PowerProtect Data Manager functionality, including the option to perform a manual backup, image-level restore, and a file-level restore of PowerProtect Data Manager VM protection policies.
PowerProtect Search Engine is installed by default when PowerProtect Data Manager is installed. The PowerProtect Search Engine indexes virtual machine file metadata to enable searches based on configurable parameters. When the indexing is added to protection policies, the assets are indexed as they are being backed up.
You can add up to five search nodes starting from PowerProtect Data Manager v19.5. You can index up to 1000 assets (VMs) per node.
PowerProtect Data Manager provides protection for clustered ESXi server storage, networking, and enterprise management. Understanding which topologies are supported in these environments can help you design your network infrastructure.
Supported topologies of clustered ESXi server storage, networking, and enterprise management include the following:
After virtual assets are backed up as part of a virtual machine protection policy in the PowerProtect Data Manager UI, you can perform image-level and file-level recoveries from individual or multiple VM backups, and restore individual virtual machine disks (VMDKs) to their original location. Below are the different types of recoveries that can be performed, according to customer requirements.
Note: Full SQL-database and transaction log restores of a virtual machine from application aware virtual machine protection policies must be performed using Microsoft application agent tools.
Note: Starting with PowerProtect Data Manager version 19.10, you can restore the VM configuration during a Restore to Original VM. If there were changes to the VM disk configuration, you cannot clear this option.
Note: Direct Restore to ESXi restore requires either the embedded VM Direct Engine with PowerProtect Data Manager, or an external VM Direct appliance that is added and registered to PowerProtect Data Manager. Additionally, ensure that you disconnect the ESXi host from the vCenter server.
Data and workloads have evolved over time, leading to a multi-generational data sprawl that introduces multiple risks to your business. Collectively, this set of risks forms a business integrity gap: the gap between where organizations’ data environments are today, hampered by the challenges of data sprawl, and where their data environments should be in order to thrive, accelerate, and digitally transform.
PowerProtect Data Manager helps you meet these challenges with its Intelligent Data Services Platform, closing the business integrity gap and enabling organizations to accelerate business growth. With its Intelligent Data Services Platform, PowerProtect Data Manager delivers a flexible, future-proof architecture.
PowerProtect Data Manager provides a single, robust VMware backup solution. Why settle for multiple point solutions for backup and recovery, disaster recovery and test/development when you can deploy PowerProtect Data Manager to provide these capabilities in a single solution?
For more details on PowerProtect Data Manager Virtualized Environment protection, see the white paper Dell PowerProtect Data Manager: Virtual Machine Protection and visit the Dell PowerProtect Data Manager web site.
Author: Chetan Padhy, Senior Engineering Technologist, Data Protection Division
Thu, 14 Apr 2022 20:12:59 -0000
|Read Time: 0 minutes
What I like most about PowerProtect Data Manager is that it supports the rising demand for data protection for all kinds of organizations. It’s powerful, efficient, scalable, and, most importantly, simple to use. And what could be simpler than using the same product with the same user interface on any environment, including any supported cloud platform?
PowerProtect Data Manager is usually used for deploying and protecting on-prem virtual machines running on VMware vSphere environments.
While PowerProtect Data Manager excels in protecting any on-prem machines and different types of technologies, such as Kubernetes, some organizations also have a cloud strategy where some or all their workloads and services are running on the cloud.
There are also organizations that use multiple cloud platforms to host and manage their workloads, and these resources need to be protected as well, especially in the cloud where there could be additional risk management and security considerations.
The good news is that PowerProtect Data Manager provides cloud and backup admins the same abilities and interface across all the supported cloud platforms: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).
AWS users can use the AWS Marketplace to deploy “Dell EMC PowerProtect Data Manager and PowerProtect DD Virtual Edition” which will trigger an automated deployment using the AWS CloudFormation service.
In this deployment method you’re asked to provide all the networking and security details up front, and then it does everything else for you, including deploying a DDVE instance that will manage the backup copies for you (with deduplication!).
Once the CloudFormation stack is deployed, you can access the PowerProtect Data Manager through any web browser, and then add and protect your cloud resources, just as if it were an on-prem deployment – super intuitive and super easy!!
I think the trickiest part of the deployment is probably making sure that all of the networking, firewall, and other security and policy restrictions allow you to connect to the PowerProtect Data Manager VM and to the DDVE.
Check out this great whitepaper that describes the entire process of deploying PowerProtect Data Manager on AWS.
For Microsoft Azure users, the process here is similar. You can deploy PowerProtect Data Manager using the Azure Marketplace service:
This whitepaper will take you through the exact steps required to successfully deploy PowerProtect Data Manager and PowerProtect DDVE on your Azure subscription.
Didn’t I say it’s really easy and works the same way in all the cloud platforms?
GCP users can use the GCP Marketplace to deploy their PowerProtect Data Manager:
This whitepaper describes the entire deployment process with detailed screenshots on GCP.
Now you can easily protect your multi-cloud resources with the same powerful protection solution!
Author: Eli Persin
Wed, 16 Feb 2022 21:53:54 -0000
|Read Time: 0 minutes
For years, NAS and backup vendors have used Network Data Management Protocol (NDMP) to protect NAS data. But NDMP has its own limitations, such as manual slicing of a NAS share to achieve multi-stream backup, limited parallel streams, and the need for periodic full backups. Customers also face challenges in protecting their growing amounts of data and in backing up that data within their specified backup windows. With NDMP, full image restores are required for a file-level recovery, and restores to any NAS device (NFS/CIFS) are not supported. These challenges lead to missed data protection and Service Level Agreements (SLAs).
Dell PowerProtect Data Manager for NAS protection addresses today’s customer challenges of protecting evolving NAS environments. In the PowerProtect Data Manager 19.9 software release, we introduced a new NAS data protection solution called Dynamic NAS protection. Unlike NDMP-based solutions, Dynamic NAS protection is a NAS-vendor-agnostic solution. With Dynamic NAS protection, customers can overcome some of the challenges they faced with NDMP.
Dynamic NAS protection addresses the challenges with the following capabilities:
Dynamic NAS protection provides a non-NDMP, crawl- and backup-based solution by leveraging the NAS Protection Engine internally using Filesystem Agents (FSA) file based-backup (FBB) technology.
Dynamic NAS protection uses the NAS Protection Engine for backup and recovery orchestration. This solution is easy to use, providing automatic discovery, orchestration, and management through the Data Manager UI.
The Data Manager NAS protection solution supports all the Data Manager objectives, such as DD Replication, Cloud Tier, progress monitoring, and SLA compliance.
The NAS file share auto slicer is a new library that is embedded in the Data Manager NAS agent. The slicer splits NAS assets (a NAS share or a file system) into multiple sub-assets in preparation for multi-stream data movement to a Dell PowerProtect DD series appliance. Slices are created using parallel threads, and each slice is backed up concurrently using available NAS Protection Engine containers and moved to a PowerProtect DD series appliance.
The slicer partitions NAS assets dynamically before each backup. Based on backup history and changes in the content of the NAS asset being sliced, relevant slices are added, removed, or rebalanced. Periodically, unbalanced trees are automatically managed as content changes over time. No manual reconfiguration is required. The default slice size is 200 GB or 1 million files.
For a full backup, a complete share is traversed in parallel to create slices. For an Incremental backup, only modified slices are traversed based on backup history.
Dynamic NAS protection enables automated load balancing of protection engine hosts, and automatic scaling for containers to achieve maximum backup streams and reduce manual management overhead. The NAS protection containers spin up and tear down, depending upon the workload. Each NAS Protection Engine can run multiple containers. Each container is pre-installed with a NAS agent and an FSA agent.
With the Data Manager 19.9 software release, the Dynamic NAS solution supports protection for Dell PowerStore, Dell Unity, and Dell PowerScale (Isilon) NAS products, and any NFS or CIFS share using generic NAS for vendors such as NetApp, Windows, and Linux file servers.
Data Manager can protect NAS assets in two ways:
Data Manager provides support to restore a NAS asset to the original location or to an alternate location. Data Manager also supports FLR using the search engine to restore individual files and folders from NAS backups. Once the Search Engine is deployed and NAS protection policy is enabled with indexing, individual files and folders can be restored from one or more NAS backups by using the File Search option.
Data Manager for NAS protection supports the following restore use cases:
With many enhancements across our Dell data protection software offerings, Dell Technologies continues to drive innovation without compromise. We stop at nothing to give you technology innovations that modernize the protection of your NAS infrastructure.
Easily automate and optimize with Dynamic NAS protection available with Data Manager. With its snapshot technology and intelligent slicing, Data Manager protects NAS data efficiently within the required backup window. Dynamic NAS protection offers up to 3x faster backups[1] and up to 2x faster restores[2].
For more details on Data Manager Dynamic NAS protection, see the white paper Dell PowerProtect Data Manager: Dynamic NAS Protection and visit the Dell PowerProtect Data Manager web site.
Author: Vinod Kumar Kumaresan, Senior Engineering Technologist, Data Protection Division
[1] When comparing PowerProtect Data Manager 19.9 with Dynamic NAS protection backup performance to NDMP backup performance with Avamar. Based on Dell internal testing. August 2021.
[2] When comparing PowerProtect Data Manager 19.9 with Dynamic NAS protection restore performance to NDMP restore performance with Avamar. Based on Dell internal testing. August 2021.
Wed, 09 Feb 2022 19:59:36 -0000
|Read Time: 0 minutes
About a decade ago, no one thought there could be hassle-free deployment and management of applications without worrying about the OS and infrastructure. When I first started using containerization, I was a little surprised and relieved, because I no longer needed to worry about whether my code or application would run on a different platform or machine.
Container technologies are now widely accepted and used, driven by application modernization and DevOps. Kubernetes is an open-source container management platform that unifies a cluster of machines into a single pool of compute resources. OpenShift is a PaaS platform that is built on top of Kubernetes and automates the development-to-deployment workflow for an application.
There has also been a continued emphasis on distributed data intensive applications that leverage both traditional relational (SQL) and non-relational (NoSQL) databases for data persistence. Databases deployed on OpenShift can be used in conjunction with both Container Storage Interface (CSI) and VMware Cloud Native Storage variants. With containers deployed in distributed environments, it becomes more important to protect these workloads and to ensure availability in case of a disaster recovery situation. Here comes Dell PowerProtect Data Manager to the rescue.
PowerProtect Data Manager ensures that data is easy to back up and restore, and remains available, consistent, and durable in a Kubernetes workload. PowerProtect DD Series appliances are the preferred target for PowerProtect Data Manager, which brings the benefit of deduplication, performance, efficiency, and security. PowerProtect Data Manager provides a centralized management UI where protection policies can be defined to manage clusters, namespaces, and other OpenShift components.
OpenShift adds several components on top of the standard Kubernetes metadata components, including Build, BuildConfig, ImageStream, ImageStreamTag, and DeploymentConfig. These additional components support a source-to-image and image-to-deployment workflow that takes an existing source code repository and builds it into Docker images for deployment. These components must also be protected when restoring OpenShift namespaces.
An OpenShift Kubernetes cluster can be registered with PowerProtect Data Manager by adding the details of the cluster in the PowerProtect Data Manager portal. Once you add the cluster, associated namespaces are available to be protected. Policies can be created to schedule and run the backup and you can also replicate and restore the protected assets.
Figure 1. Adding OpenShift Kubernetes cluster as an asset source
The powerprotect and velero-ppdm namespaces are created automatically once the cluster is integrated with PowerProtect Data Manager. During the discovery process, when an OpenShift cluster is detected, PowerProtect Data Manager automatically installs the OpenShift Application Data Protection (OADP) operator, which in turn deploys Velero and the required plugins. During backups and restores, the OpenShift plugin is leveraged to back up the associated OpenShift components. This process is transparent to the user in terms of policy creation and during restores.
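To see this for yourself after registration, you can check for the namespaces and their pods with standard OpenShift commands (namespace names as described above):
oc get namespaces | grep -E 'powerprotect|velero-ppdm'
oc get pods -n powerprotect
oc get pods -n velero-ppdm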
Figure 2. PowerProtect Data Manager Controller with OADP
PowerProtect Controller is the component that is installed on the Kubernetes cluster when PowerProtect Data Manager discovers the cluster. The backup and restore controllers manage BackupJob Custom Resource (CR) and RestoreJob CR definitions and are responsible for the backup and restore of Persistent Volumes.
The stateless containerized proxy (cProxy) is installed on the Kubernetes cluster when a backup or restore process starts and is deleted when the process completes. It is responsible for managing Persistent Volume snapshots (snap copies), mounting snapshots, and moving the data to the target storage. It is also responsible for restoring data into Persistent Volumes from target storage and making the data available for attaching to Pods. It also acts as an agent plug-in orchestrator for application aware backups.
PowerProtect Data Manager can easily integrate with OpenShift Kubernetes cluster, ensuring that data on the cluster is easy to back up and restore, always available, consistent, and durable.
For more details on how to protect OpenShift Kubernetes cluster, see the white paper PowerProtect Data Manager Protecting OpenShift Workloads.
Author: Charu
Thu, 27 Jan 2022 19:10:36 -0000
|Read Time: 0 minutes
Big Data is no longer a buzzword. Over the past decade, big data and analytics have become the most reliable source of insight for businesses as they harness collective data to align strategy, help teams collaborate, uncover new opportunities, and compete in the global marketplace. The market size for big data analytics (BDA) was USD 206.95 billion in 2020 and USD 231.43 billion in 2021, according to Fortune Business Insights™ in its report titled “Big Data Analytics Market, 2021-2028”.
With big data analytics on the course to become the most mission critical enterprise application, enterprises are demanding a robust level of backup, recovery, and disaster recovery solutions for their big data environment, in particular Apache Hadoop®. Apache Hadoop is a clustered distributed system that is used to process massive amounts of data. It provides software and a framework for distributed storage and processing of big data, using various distributed processing models.
Some common mistakes that make data protection challenging in Apache Hadoop:
Why do we need a dedicated Data Protection solution in Apache Hadoop aside from replication and in-built snapshots?
Overview of Dell PowerProtect Hadoop Data Protection
Dell Technologies has developed a Hadoop File System driver for PowerProtect DD series appliances that allows Hadoop data management functions to transparently use DD series appliances for the efficient storage and retrieval of data. This driver is called DD Hadoop Compatible File System (DDHCFS).
DDHCFS is the Hadoop-compatible file system that is used by DD series appliances. It provides versioned backups and recoveries of the Hadoop Distributed File System (HDFS) by using DD snapshots. It provides source-level deduplication and direct backups to DD series appliances. DDHCFS can also distribute backup-and-restore streams across all HDFS DataNodes, making backups and restores faster. It can also be integrated with Kerberos and Hadoop local credentials.
PowerProtect Hadoop Data Protection requires minimal configuration and installs on only one node of the Hadoop cluster. It is tightly integrated into the Hadoop file system, and leverages Hadoop’s scale-out distributed processing architecture to parallelize data transfer from Hadoop to the PowerProtect DD series appliance. DD Boost provides network-efficient data transfer with client-side deduplication, while PowerProtect DD provides storage efficiency through deduplication and compression. Together this makes it the most efficient method of moving large amounts of data from a Hadoop cluster to a PowerProtect DD series appliance. Internally, standard Hadoop constructs such as DistCp (distributed file copy) and HDFS/HBase snapshots are leveraged to accomplish backup tasks, as sketched below.
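To illustrate the kind of constructs DDHCFS builds on, here is a rough sketch using standard Hadoop commands; the paths and snapshot name are just examples, and the actual backup target and invocation are handled by the PowerProtect Hadoop Data Protection tooling rather than typed by hand:
# Allow snapshots on the HDFS directory that holds the application data (example path)
hdfs dfsadmin -allowSnapshot /data/app1
# Create a point-in-time HDFS snapshot to back up from
hdfs dfs -createSnapshot /data/app1 backup-2022-01-27
# Distributed copy of the snapshot contents to a backup target (placeholder URI)
hadoop distcp /data/app1/.snapshot/backup-2022-01-27 <backup-target-uri>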
Some highlights of the PowerProtect Hadoop Data protection solution for Hadoop environments are:
Conclusion
PowerProtect Hadoop Data Protection, which is part of the Dell Data Protection Suite Family, provides complete Hadoop data protection. Apache Hadoop customers further benefit from PowerProtect DD series appliances when using the power of DD Boost, with strong backup performance, reduced bandwidth requirements, and improved load balancing and reliability.
The PowerProtect DD series appliance’s Data Invulnerability Architecture provides outstanding data protection, ensuring that data from your Hadoop cluster can be recovered when needed and that the data can be trusted. It provides storage efficiency through variable-length deduplication and compression, typically reducing storage requirements by 10-30x[1].
Throughput with DD Boost technology can scale from 7TB/hr on the DD3300 to 94TB/hr on the DD9900. What makes DD Boost special is the deduplication taking place at the source client: only unique data is sent through the network to the PowerProtect DD systems. According to IDC, PowerProtect DD is the #1 product in the Purpose Built Backup Appliance (PBBA) market[2].
If you are already using PowerProtect DD series appliances for other data protection needs, you can leverage the same processes and expertise to protect your big data Hadoop environment.
Author: Sonali Dwivedi LinkedIn
[1] Based on Dell internal testing, January 2020. https://www.delltechnologies.com/en-ca/data-protection/powerprotect-backup-appliances.htm#overlay=//www.dellemc.com/en-ca/collaterals/unauth/analyst-reports/products/data-protection/esg-techreview-next-generation-performance-with-powerprotect.pdf.
[2] Based on IDC WW Purpose-Built Backup Appliance Systems Tracker, 1Q21 (revenue), June 2021.
Tue, 29 Nov 2022 22:22:38 -0000
|Read Time: 0 minutes
NOTE: After seeing several questions and mails about the process, I want to clarify: the process described in this blog is relevant for PPDM 19.9. I recommend that you check the latest EKS protection with PowerProtect Data Manager in this blog by Idan Kentor: https://infohub.delltechnologies.com/p/powerprotect-data-manager-how-to-protect-aws-eks-elastic-kubernetes-service-workloads/.
Recently I had the chance to deploy PowerProtect Data Manager 19.9 on AWS with a PowerProtect DD Virtual Edition and I wanted to test the AWS EKS (Elastic Kubernetes Service) protection feature to see the differences between any other Kubernetes deployments.
The PowerProtect Data Manager deployment itself was super easy. Initiated from the AWS marketplace, it created a CloudFormation stack that deployed all of the needed services after asking for network and other settings. What I especially liked about it was that it deployed the PowerProtect DD as well, so I didn’t have to deploy it separately.
Deploying and configuring the EKS cluster and its Node-Group was easy, but the installation of AWS EBS CSI drivers was a bit challenging, so I decided to share the procedure and my thoughts so others could do it just as easily.
Before you begin, you should make sure that you have kubectl and AWS CLI installed on your computer.
I started by deploying the EKS cluster using the AWS management console and used the 1.21 version (which was also the default one). I then created a Node-Group just as described in the AWS documentation. (This step involved attaching a role, but other than that it’s very intuitive and you could manage on your own without the documentation).
It’s highly recommended to read all of the relevant documentation to understand the following steps. I’ll summarize what I did.
I know it has a lot of steps and it looks scary, but this would probably make your life so much easier and will get you protecting EKS namespaces in no time!
1. Configure kubectl to allow you to connect to the EKS cluster:
aws eks --region <region-code> update-kubeconfig --name <eks-cluster-name>
2. Create a secret.yaml file on your computer, which will be used to configure the AWS EBS CSI drivers.
Add your credentials to the file itself. The required permissions are described here:
https://docs.aws.amazon.com/eks/latest/userguide/ebs-csi.html
The yaml structure and more details are available at the AWS EBS CSI driver git page here:
aws-ebs-csi-driver/secret.yaml at master · kubernetes-sigs/aws-ebs-csi-driver · GitHub
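For reference, the secret.yaml follows the usual Kubernetes Secret layout; a minimal sketch (key names as used by the EBS CSI driver examples, values are placeholders you must replace with your own credentials) looks like this:
apiVersion: v1
kind: Secret
metadata:
  name: aws-secret
  namespace: kube-system
stringData:
  key_id: "<your-access-key-id>"
  access_key: "<your-secret-access-key>"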
3. Apply the secret:
kubectl apply -f secret.yaml
4. Install the EBS CSI driver:
kubectl apply -k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.2"
5. The default storage class on EKS is gp2. Because PowerProtect Data Manager does not support it, it needs to be changed to ebs-sc, which works with the EBS CSI driver.
Create and install the EBS Storage Class (a sample ebs-sc.yaml is shown after the command):
kubectl apply -f ebs-sc.yaml
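The post doesn’t show the contents of ebs-sc.yaml; a minimal version, matching the storage class used in the newer EKS procedure earlier in this document, could look like this (Step 7 below takes care of marking it as the default):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer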
6. Apply all of these:
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-ebs-csi-driver/master/examples/kubernetes/snapshot/specs/classes/snapshotclass.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml
7. Change the default Storage Class to EBS:
kubectl patch storageclass gp2 -p "{\"metadata\": {\"annotations\":{\"storageclass.kubernetes.io/is-default-class\":\"false\"}}}"
kubectl patch storageclass ebs-sc -p "{\"metadata\": {\"annotations\":{\"storageclass.kubernetes.io/is-default-class\":\"true\"}}}"
8. Create a service account on the EKS cluster in order to connect to the PowerProtect Data Manager server and allow the EKS cluster discovery:
kubectl apply -f ppdm-discovery.yaml
9. Create another one for the protection itself:
kubectl apply -f ppdm-rbac.yaml
Both of the previous yaml files can be found on any PowerProtect Data Manager at the following path: /usr/local/brs/lib/cndm/misc/rbac.tar.gz
At this point you should already have, or should create, a new namespace on your EKS cluster and have an application that you want to protect running on it.
10. List the secrets in the powerprotect namespace that was created by running the previous yaml file:
kubectl get secret -n powerprotect
11. Get the relevant secret from the list that you got from the previous command (the name will change in every deployment, but should be in the following format):
kubectl describe secret ppdm-discovery-serviceaccount-token-45abc -n powerprotect
This will output a string with the secret that you need in order to register the EKS cluster in PowerProtect Data Manager.
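If you’d rather pull just the token value in one shot, a one-liner like the one used in the newer EKS procedure earlier in this document also works here (it assumes the secret name contains “disco”, which the ppdm-discovery secret does):
kubectl describe secret $(kubectl get secret -n powerprotect | awk '/disco/{print $1}') -n powerprotect | awk '/token:/{print $2}'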
12. Get the FQDN to register the EKS cluster (You’re looking for the Kubernetes control plane, and must remove the “https://”):
kubectl cluster-info
13. Get the EKS Cluster Certificate authority from the AWS EKS cluster UI, and convert it from BASE64. Use this website for example: https://www.base64decode.org/.
14. SSH to the PowerProtect Data Manager server, then create a new eks-root.pem file with the decoded BASE64 result (including the BEGIN and END CERTIFICATE lines).
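As an alternative to the decoding website, you can retrieve and decode the certificate from any machine with the AWS CLI configured, and then copy the resulting eks-root.pem file to the PowerProtect Data Manager server (the query path below is the standard describe-cluster output field):
aws eks describe-cluster --name <eks-cluster-name> --query "cluster.certificateAuthority.data" --output text | base64 -d > eks-root.pem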
15. Run the following command:
keytool -importcert -alias <your-eks-cluster-name> -keystore /etc/ssl/certificates/extserver/extserver.truststore -storepass extserver -file eks-root.pem
16. Connect to the PowerProtect Data Manager UI, and add a new Kubernetes Asset Source.
Use the FQDN from Step 12 (again, without the https://) and create new credentials with the Service Account Token that you got in Step 11.
After the EKS cluster is added as an Asset Source, you can protect the namespaces in your EKS cluster by creating a new Protection Policy. For more info, check out the interactive demos at the Dell Technologies Demo Center.
Author: Eli Persin
Thu, 09 Dec 2021 15:43:42 -0000
|Read Time: 0 minutes
We have been continuously working to extend the level of support for Kubernetes with Dell EMC PowerProtect Data Manager, to protect Kubernetes workloads on different platforms.
Continuing on this path, we now protect SUSE Rancher managed Kubernetes workloads with PowerProtect Data Manager, taking advantage of a partnership with SUSE Rancher.
Kubernetes clusters and containers have become a popular option for deploying enterprise applications in the cloud and in on-premise environments. SUSE Rancher is a Kubernetes management platform that simplifies cluster installation and operations, whether they are on-premises, in the cloud, or at the edge, giving the freedom to build and run containerized applications. PowerProtect Data Manager protects SUSE Rancher managed Kubernetes workloads and ensures high availability and consistent, reliable backup and restore for Kubernetes workloads during normal operations or during a disaster recovery situation.
PowerProtect Data Manager enables customers to protect, manage, and recover data for on-premises, virtualized, or cloud deployments. Using PowerProtect Data Manager, customers can discover, protect, and restore workloads in a SUSE Rancher managed Kubernetes environment to ensure that the data is easy to backup and restore.
PowerProtect Data Manager enhances the protection by sending the data directly to the Dell EMC PowerProtect DD series appliance to gain benefits from unmatched efficiency, deduplication, performance, and scalability. See the solution brief and this technical white paper for more details.
SUSE Rancher is an enterprise computing platform for running Kubernetes for on-premises, cloud, and edge environments. With Rancher, you can form your own Kubernetes-as-a-Service by creating, upgrading, and managing Kubernetes clusters. Rancher can set up clusters by itself or work with a hosted Kubernetes provider. It addresses the operational and security challenges of managing multiple Kubernetes clusters anywhere. SUSE Rancher also provides IT operators and development teams with integrated tools for building, deploying, and running cloud-native workloads.
SUSE Rancher supports the management of CNCF-Certified Kubernetes distributions, such as Rancher Kubernetes Engine (RKE). RKE is a certified Kubernetes distribution for both bare-metal and virtualized servers.
You can integrate PowerProtect Data Manager with SUSE Rancher managed Kubernetes clusters through the Kubernetes APIs to discover namespaces and associated persistent resources such as PersistentVolumeClaims (PVCs). PowerProtect Data Manager discovers the Kubernetes clusters using the IP address or fully qualified domain name (FQDN). PowerProtect Data Manager uses the discovery service account and the token kubeconfig file to integrate with kube-apiserver.
PowerProtect Data Manager integrates with SUSE Rancher managed Kubernetes clusters for data protection in the following ways:
SUSE Rancher managed RKE downstream Kubernetes clusters integration with PowerProtect Data Manager
Adding the RKE downstream Kubernetes cluster with PowerProtect Data Manager as an asset source
Once the Kubernetes cluster is added as an asset source in PowerProtect Data Manager and the discovery is complete, the associated namespaces are available as assets for protection. PowerProtect Data Manager protects two types of Kubernetes cluster assets: Namespaces and PVCs. Note that PPDM also protects the associated metadata for namespaces and cluster resources that include secrets, ConfigMaps, custom resources, RoleBindings, and so on.
During the discovery process, PowerProtect Data Manager creates the following namespaces in the cluster:
Kubernetes uses persistent volumes to store persisted application data. Persistent volumes are created on external storage and then attached to a particular pod using PVCs. PVCs are included along with other namespaces in PowerProtect Data Manager backup and recovery operations. Dell EMC PowerStore, PowerMax, XtremIO, and PowerFlex storage platforms all come with CSI plugins to support containerized workloads running on Kubernetes.
With this easy integration for data protection with PowerProtect Data Manager, Dell Technologies empowers Kubernetes admins to perform backup/recovery operations and ensure that SUSE Rancher managed Kubernetes cluster workloads are available, consistent, durable, and recoverable.
For more details, see the white paper SUSE Rancher and RKE Kubernetes cluster using CSI Driver on DELL EMC PowerFlex about how to protect SUSE Rancher managed Kubernetes workloads with PowerProtect Data Manager.
Author: Vinod Kumaresan
Thu, 21 Jul 2022 16:21:20 -0000
|Read Time: 0 minutes
Dell EMC launched Cloud Disaster Recovery (Cloud DR) in 2017, to help customers expand their DR to the cloud and meet their compliance demands, and to give them peace of mind about running their workloads in the cloud in case of a disaster scenario.
The initial phase was to support the most used cloud platforms back then, AWS and Azure, and focus on simplicity and automation, all while leveraging existing data protection technologies that customers already have, such as Avamar and Data Domain.
From its first release, Cloud DR supported automated deployment. This process created everything the solution needed on the customer’s cloud account, and was initiated from the on-prem component (the Cloud DR Add-on (CDRA)) over the internet -- the most common way to connect small or medium size organizations to the cloud.
Over time, more and more organizations reached a level of maturity of cloud usage: in the way they worked with it, protected their resources running on the cloud, and in how they planned to combine that with their on-prem resources. All of this resulted in larger organizations requiring more advanced and up-to-date features that Cloud DR already offered.
To support those new and larger organizations, Cloud DR’s core functionality was integrated into additional data protection technologies, such as RecoverPoint for VMs, PowerProtect Data Manager, and PowerProtect DP (what used to be called IDPA), so organizations who were already working and using these solutions would also be able to benefit from Cloud DR features and to protect and recover their VMs to, and on, the cloud.
Naturally, larger organizations are more complex. They usually combine their cloud and on-prem resources, connecting the environments with VPN. Some also use a dedicated network (such as Direct Connect in AWS or ExpressRoute in Azure).
These customers wanted to leverage their private connections for deployment and usage. However, Cloud DR’s core design was to deploy over the public internet and create its AWS or Azure resources with an auto-generated public IP, because that original design fit almost all customers. Because some customers have strict security rules preventing any public IP creation, there was naturally a rising demand to add support for VPN connections, without creating or using any public IP.
To address that concern, a solution was introduced for CDRA users to switch the way the CDRA communicates with the Cloud DR Server (CDRS), changing its default deployment from using a public IP to using its private IP, but this was relevant only for CDRA and only for specific Cloud DR releases (19.5 - 19.8).
In Cloud DR 19.9 (which is also included in PowerProtect Data Manager 19.9), released in September 2021, this requirement is further simplified. Cloud DR allows you to deploy over your existing private connection. In the web UI, you can also easily select whether you want to deploy through the internet and create a public IP, or deploy through your private network.
While the Cloud DR interface makes it easy and intuitive to select the connection mode, it’s important that you configure the networks properly to support private connectivity. (The most common cause for failed deployments is related to misconfigured networks, routing, and firewalls.)
This new feature should work as-is for well configured environments. You need to make sure your on-prem CDRA or PowerProtect Data Manager can reach and send its protected data to the cloud object storage (AWS S3 bucket / Azure Storage Account) over VPN. That’s because by default the object storage is reachable through the internet. Of course, you can also keep and use that default behavior and make sure that the CDRA or PowerProtect Data Manager can also send files through the internet, and connect to the CDRS through your VPN connection.
With Cloud DR and PowerProtect Data Manager you can protect your workloads with an easy deployment to your cloud account, and now also with a simplified deployment over your VPN.
Be sure to check out our Cloud DR best practices white paper, as well as our Cloud DR demos and interactive demos.
Author: Eli Persin
Wed, 30 Nov 2022 23:09:59 -0000
|Read Time: 0 minutes
There’s a new Oracle database discovery method introduced in PowerProtect Data Manager version 19.9!
Dell EMC PowerProtect Data Manager provides software defined data protection, automated discovery, deduplication, operational agility, self-service, and IT governance for physical, virtual, and cloud environments. PowerProtect Data Manager helps you to:
PowerProtect Data Manager gives you valuable insight into protected on-premises and in-cloud workloads, applications, file systems, and virtual machines. Designed with operational simplicity and agility in mind, PowerProtect Data Manager enables protecting traditional workloads, such as Oracle, Exchange, SQL, SAP HANA, and file systems, as well as Kubernetes containers and virtual environments.
PowerProtect Data Manager supports discovering Oracle databases either through the traditional /etc/oratab file located on the Oracle server or, starting in PPDM 19.9, by using Oracle PMON (Process MONitor) processes. Let’s review these two methods.
PowerProtect Data Manager uses the /etc/oratab file to discover Oracle databases. Note that since Oracle release 12.2, Oracle no longer automatically updates the /etc/oratab file with information such as ORACLE_HOME and ORACLE_SID. Administrators can still update the /etc/oratab file manually to continue using this discovery method.
Here’s an example of an /etc/oratab file with entries for a database called “oradb105”.
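The screenshot isn’t reproduced here; for reference, an /etc/oratab entry uses the format <ORACLE_SID>:<ORACLE_HOME>:<Y|N>, so an entry for oradb105 might look like the following (the ORACLE_HOME path is just an illustrative value):
oradb105:/u01/app/oracle/product/19.3.0/dbhome_1:N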
PowerProtect Data Manager 19.9 introduces a new mechanism for discovering Oracle databases. This method uses information contained in the Oracle PMON processes and doesn’t have dependencies on /etc/oratab entries. PMON is an Oracle background process created when a database instance is started.
Here’s an example of a PMON process for the same database mentioned earlier:
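The screenshot isn’t reproduced here either; you can list the PMON background processes yourself, keeping in mind that the process name takes the form ora_pmon_<ORACLE_SID> (for example, ora_pmon_oradb105 for the database above):
# The [p] trick keeps the grep process itself out of the results
ps -ef | grep [p]mon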
The PMON process contains the name of the database. The PowerProtect Data Manager discovery job will look for those PMON processes and add the database to the asset list (as in the following figure).
In conclusion, PowerProtect Data Manager no longer requires entries in /etc/oratab to discover Oracle databases. Those entries continue to have the highest precedence, but PowerProtect Data Manager also looks for Oracle PMON processes to gather the full list of assets.
I urge you to check out this interactive demo of PowerProtect Data Manager 19.9.
For more information, see the whitepaper PowerProtect Data Manager: Oracle® RMAN Agent Backup and Recovery that provides details about the Oracle RMAN agent architecture and the backup and restore workflow.
Author: Frank Gagnon