Demystifying CSI plug-in for PowerFlex (persistent volumes) with Red Hat OpenShift
Wed, 14 Oct 2020 18:12:01 -0000
The Container Storage Interface (CSI) is a standard for exposing file and block storage to containerized workloads on orchestrators such as Kubernetes and OpenShift. CSI enables third-party storage providers (for example, Dell PowerFlex) to write plug-ins that let OpenShift consume storage from their backends as persistent storage.
The CSI driver for Dell EMC VxFlex OS (PowerFlex) can be installed using the Dell EMC Storage CSI Operator, a community operator that can be deployed from OperatorHub.io.
Components on the master nodes do not communicate directly with the CSI driver; they interact only with the API server. The CSI driver must watch the Kubernetes API and trigger the appropriate CSI operations in response. The kubelet discovers CSI drivers through the kubelet plug-in registration mechanism and issues calls directly to the driver.
- External Provisioner – The CSI external provisioner is a sidecar container that watches the Kubernetes API server for PersistentVolumeClaim objects and calls CreateVolume against the specified CSI endpoint to provision a volume.
- External Attacher – The CSI external attacher is a sidecar container that watches the API server for VolumeAttachment objects and triggers ControllerPublishVolume and ControllerUnpublishVolume operations against a CSI endpoint.
- Node-driver-registrar – The CSI node driver registrar is a sidecar container that fetches driver information from a CSI endpoint and registers it with the kubelet on that node.
- Cluster-driver-registrar – The CSI cluster driver registrar is a sidecar container that registers a CSI driver with a Kubernetes cluster by creating a CSIDriver object (a minimal example of such an object follows this list).
- CSI Controller Plug-in – The controller component can be deployed as a Deployment or StatefulSet on any node in the cluster. It consists of the CSI driver that implements the CSI Controller service.
- CSI Identity – Enables Kubernetes components and CSI containers to identify the driver.
- CSI Node Plug-in – The node component is deployed on every node in the cluster through a DaemonSet. It consists of the CSI driver that implements the CSI Node service and the node-driver-registrar sidecar container.
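For illustration, here is a minimal sketch of a CSIDriver registration object for the PowerFlex (VxFlex OS) driver, as created by the cluster-driver-registrar described above. The field values are assumptions for a typical deployment and may differ between driver releases; on older clusters the object may use the storage.k8s.io/v1beta1 API.

# Minimal CSIDriver registration sketch (values are illustrative)
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: csi-vxflexos.dellemc.com    # driver name that the kubelet registers
spec:
  attachRequired: true              # the external attacher handles ControllerPublishVolume
  podInfoOnMount: true              # pass pod metadata to NodePublishVolume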
CSI and Persistent Storage
Storage within OpenShift Container Platform 4.x is managed from the worker nodes. The CSI API uses two resources: PersistentVolume (PV) and PersistentVolumeClaim (PVC) objects.
- Persistent Volume – Kubernetes presents physical storage devices to the cluster in the form of objects called Persistent Volumes.
- Persistent Volume Claim – This object lets pods request and consume storage from Persistent Volumes.
- Storage Class – This object stores the information needed to create a persistent volume and lets you provision PV/PVC pairs for pods.
The sample deployment defines two storage classes, powerflexos and powerflex-xfs, each tied to the backing array through the csi-vxflexos.dellemc.com/X_CSI_VXFLEXOS_SYSTEMNAME key.
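For reference, a storage class for this driver might look like the following minimal sketch. The provisioner name comes from the driver; the storage pool and file-system parameters are placeholders whose exact names and values depend on your PowerFlex system and driver version.

# Hypothetical PowerFlex storage class sketch
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: powerflexos
provisioner: csi-vxflexos.dellemc.com     # the PowerFlex CSI driver
reclaimPolicy: Delete                     # default; use Retain to keep released volumes
parameters:
  storagepool: pool1                      # placeholder storage pool name
  csi.storage.k8s.io/fstype: ext4         # ext4 or xfs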
CSI driver capabilities
- Static provisioning – Lets you manually make existing PowerFlex storage available to the cluster.
- Dynamic provisioning – Storage volumes are created on demand, using the provisioner specified by the StorageClass object (see the sample claim after this list).
- Retain reclaim policy – When the PersistentVolumeClaim is deleted, the corresponding PersistentVolume is not deleted; it moves to the Released state, and its data can be recovered manually.
- Delete reclaim policy – The default reclaim policy; unlike Retain, the persistent volume is deleted when the claim is removed.
- Access mode – ReadWriteOnce: the volume can be mounted as read/write by a single node.
- Supported file systems – ext4 and xfs.
- Raw block volumes – With the raw block option, a PV can be attached to a pod or application directly as a block device, without being formatted with an ext4 or xfs file system.
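To illustrate dynamic provisioning against one of these storage classes, a claim might look like the following sketch; the claim name, storage class, and size are placeholders, and setting volumeMode: Block requests a raw block volume instead of a formatted file system.

# Hypothetical claim against the powerflexos storage class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: powerflex-claim            # placeholder name
spec:
  accessModes:
    - ReadWriteOnce                # access mode supported by the driver
  volumeMode: Filesystem           # use Block for a raw block volume
  storageClassName: powerflexos
  resources:
    requests:
      storage: 8Gi                 # placeholder size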
Related Blog Posts
How to Build a Custom Dell CSI Driver
Wed, 20 Apr 2022 21:28:38 -0000
With all the Dell Container Storage Interface (CSI) drivers and dependencies being open-source, anyone can tweak them to fit a specific use case.
As a practical example, the following steps show how to create a patched version of the Dell CSI Driver for PowerScale that supports a longer mounted path.
The CSI specification states that a driver must accept a maximum path length of at least 128 bytes:
// SP SHOULD support the maximum path length allowed by the operating
// system/filesystem, but, at a minimum, SP MUST accept a max path
// length of at least 128 bytes.
The PowerScale hardware supports path lengths of up to 1023 characters, as described in the File system guidelines chapter of the PowerScale specifications. We'll therefore build a csi-powerscale driver that supports that maximum path length.
Steps to patch a driver
The Dell CSI drivers are all written in Go and run as containers, so the prerequisites are relatively simple. You need:
- Golang (v1.16 minimum at the time of publication of this post)
- Podman or Docker
- Optionally, make to run the Makefile
Clone, branch, and patch
The first thing to do is to clone the official csi-powerscale repository in your GOPATH source directory.
cd $GOPATH/src/github.com/
git clone git@github.com:dell/csi-powerscale.git dell/csi-powerscale
cd dell/csi-powerscale
You can then pick the version of the driver you want to patch; git tag gives the list of versions.
In this example, we pick v2.1.0 with: git checkout v2.1.0 -b v2.1.0-longer-path.
The next step is to obtain the library we want to patch.
gocsi and every other open-source component maintained for Dell CSI are available on https://github.com/dell.
Fork the repository to your own GitHub account (github.com/coulof in this example).
Now we can get the library with:
cd $GOPATH/src/github.com/
git clone git@github.com:coulof/gocsi.git coulof/gocsi
cd coulof/gocsi
To simplify the maintenance and merge of future commits, it is wise to add the original repo as an upstream branch with:
git remote add upstream git@github.com:dell/gocsi.git
The next important step is to pick and choose the correct library version used by our version of the driver.
We can check the csi-powerscale dependency file with: grep gocsi $GOPATH/src/github.com/dell/csi-powerscale/go.mod and create a branch of that version. In this case, the version is v1.5.0, and we can branch it with: git checkout v1.5.0 -b v1.5.0-longer-path.
Now it’s time to write our patch, which is just a one-liner:
--- a/middleware/specvalidator/spec_validator.go
+++ b/middleware/specvalidator/spec_validator.go
@@ -770,7 +770,7 @@ func validateVolumeCapabilitiesArg(
 }
 
 const (
-	maxFieldString = 128
+	maxFieldString = 1023
 	maxFieldMap = 4096
 	maxFieldNodeId = 256
 )
We can then commit and push our patched library with a nice tag:
git commit -a -m 'increase path limit'
git push --set-upstream origin v1.5.0-longer-path
git tag -a v1.5.0-longer-path
git push --tags
With the patch committed and pushed, it’s time to build the CSI driver binary and its container image.
Let’s go back to the csi-powerscale main repo: cd $GOPATH/src/github.com/dell/csi-powerscale
In go.mod, we add a replace directive that points the github.com/dell/gocsi dependency to our patched fork:

diff --git a/go.mod b/go.mod
index 5c274b4..c4c8556 100644
--- a/go.mod
+++ b/go.mod
@@ -26,6 +26,7 @@ require (
 )
 
 replace (
+	github.com/dell/gocsi => github.com/coulof/gocsi v1.5.0-longer-path
 	k8s.io/api => k8s.io/api v0.20.2
 	k8s.io/apiextensions-apiserver => k8s.io/apiextensions-apiserver v0.20.2
 	k8s.io/apimachinery => k8s.io/apimachinery v0.20.2
When that is done, we obtain the new module from the online repo with: go mod download
Note: If you want to test the changes locally only, we can use the replace directive to point to the local directory with:
replace github.com/dell/gocsi => ../../coulof/gocsi
We can then build our new driver binary locally with: make build
After compiling it successfully, we can create the image. The shortest path is to replace the csi-isilon binary in the dellemc/csi-isilon Docker image:
cat << EOF > Dockerfile.patch
FROM dellemc/csi-isilon:v2.1.0
COPY "csi-isilon" .
EOF
docker build -t coulof/csi-isilon:v2.1.0-long-path -f Dockerfile.patch .
Alternatively, you can rebuild the entire Docker image using the provided Makefile.
By default, the driver uses a Red Hat Universal Base Image minimal as its base. That base image sometimes lacks dependencies, so you can use another flavor, such as:
BASEIMAGE=registry.fedoraproject.org/fedora-minimal:latest REGISTRY=docker.io IMAGENAME=coulof/csi-powerscale IMAGETAG=v2.1.0-long-path make podman-build
The image is then ready to be pushed to whichever image registry you prefer; in this case, hub.docker.com: docker push coulof/csi-isilon:v2.1.0-long-path
Update CSI Kubernetes deployment
The last step is to replace the driver image used in your Kubernetes with your custom one.
Again, multiple solutions are possible, and the one to choose depends on how you deployed the driver.
If you used the helm installer, you can add the following block at the top of the myvalues.yaml file:
images:
  driver: docker.io/coulof/csi-powerscale:v2.1.0-long-path
Then update or uninstall/reinstall the driver as described in the documentation.
If you decided to use the Dell CSI Operator, you can simply point to the new image:
apiVersion: storage.dell.com/v1
kind: CSIIsilon
metadata:
  name: isilon
spec:
  driver:
    common:
      image: "docker.io/coulof/csi-powerscale:v2.1.0-long-path"
    ...
Or, if you want to do a quick and dirty test, you can create a patch file (here named path_csi-isilon_controller_image.yaml) with the following content:
spec:
  template:
    spec:
      containers:
        - name: driver
          image: docker.io/coulof/csi-powerscale:v2.1.0-long-path
You can then apply it to your existing install with: kubectl patch deployment -n powerscale isilon-controller --patch-file path_csi-isilon_controller_image.yaml
In all cases, you can check that everything works by first making sure that the Pod is started:
kubectl get pods -n powerscale
and that the logs are clean:
kubectl logs -n powerscale -l app=isilon-controller -c driver
Wrap-up and disclaimer
As demonstrated, because the drivers are open source, it is easy to fix and improve Dell CSI drivers and Dell Container Storage Modules.
Keep in mind that Dell officially supports (through tickets, Service Requests, and so on) the image and binary, but not the custom build.
Thanks for reading and stay tuned for future posts on Dell Storage and Kubernetes!
Author: Florian Coulombel
How PowerFlex Transforms Big Data with VMware Tanzu Greenplum
Wed, 13 Apr 2022 13:16:23 -0000
Quick! The word has just come down. There is a new initiative that requires a massively parallel processing (MPP) database, and you are in charge of implementing it. What are you going to do? Luckily, you know the answer. You also just discovered that the Dell PowerFlex Solutions team has you covered with a solutions guide for VMware Tanzu Greenplum.
What is in the solutions guide and how will it help with an MPP database? This blog provides the answer. We look at what Greenplum is and how to leverage Dell PowerFlex for both the storage and compute resources in Greenplum.
Infrastructure flexibility: PowerFlex
If you have read my other blogs or are familiar with PowerFlex, you know it has powerful transmorphic properties. For example, PowerFlex nodes sometimes function as both storage and compute, like hyperconverged infrastructure (HCI). At other times, PowerFlex functions as a storage-only (SO) node or a compute-only (CO) node. Even more interesting, these node types can be mixed and matched in the same environment to meet the needs of the organization and the workloads that they run.
This transmorphic property of PowerFlex is helpful in a Greenplum deployment, especially with the configuration described in the solutions guide. Because Greenplum is built on open-source PostgreSQL, the deployment is optimized for the needs of an MPP database. PowerFlex can deliver the compute performance necessary to support massive data IO with its CO nodes. The PowerFlex infrastructure can also support workloads running on CO nodes or on hybrid nodes that combine compute and storage. By leveraging the malleable nature of PowerFlex, no additional silos are needed in the data center, and it may even help remove existing ones.
The architecture used in the solutions guide consists of 12 CO nodes and 10 SO nodes. The CO nodes have VMware ESXi installed on them, with Greenplum instances deployed on top. There are 10 segments and one director deployed for the Greenplum environment. The 12th CO node is used for redundancy.
The storage tier uses the 10 SO nodes to deliver 12 volumes backed by SSDs. This configuration creates the high-speed, highly redundant storage system that Greenplum needs. Also, two protection domains are used to provide both primary and mirror storage for the Greenplum instances. Greenplum mirrors the volumes between those protection domains, adding an additional level of protection to the environment, as shown in the architecture figure in the solutions guide.
By using this fluid and composable architecture, the components can be scaled independently of one another, allowing for storage to be increased either independently or together with compute. Administrators can use this configuration to optimize usage and deliver appropriate resources as needed without creating silos in the environment.
Testing and validation with Greenplum: we have you covered
The solutions guide not only describes how to build a Greenplum environment, it also addresses testing, which many administrators want to perform before they finish a build. The guide covers performing basic validations with FIO and gpcheckperf. In the simplest terms, these tools ensure that IO, memory, and network performance are acceptable. The FIO tests that were run for the guide showed that the HBA was fully saturated, maximizing both read and write operations. The gpcheckperf testing showed a performance of 14,283.62 MB/sec for write workloads.
Wouldn’t you feel better if a Greenplum environment was tested with a real-world dataset? That is, taking it beyond just the minimum, maximum, and average numbers? The great news is that the architecture was tested that way! Our Dell Digital team has developed an internal test suite running static benchmarked data. This test suite is used at Dell Technologies across new Greenplum environments as the gold standard for new deployments.
In this test design, all the datasets and queries are static. This scenario allows for a consistent measurement of the environment from one run to the next. It also provides a baseline of an environment that can be used over time to see how its performance has changed -- for example, if the environment sped up or slowed down following a software update.
Massive performance with real data
So how did the architecture fare? It did very well! When 182 parallel complex queries were run simultaneously to stress the system, the test completed in just under 12 minutes. In that time, the environment sustained a read bandwidth of 40 GB/s and a write bandwidth of 10 GB/s. These results were achieved using actual production-based queries from the Dell Digital team's workload, and they come close to saturating the network bandwidth of the environment, which indicates that there are no storage bottlenecks.
The design covered in this solution guide goes beyond simply verifying that the environment can handle the workload; it also shows how the configuration can maintain performance during ongoing operations.
Maintaining performance with snapshots
One of the key areas that we tested was the impact of snapshots on performance. Snapshots are a frequent operation in data centers and are used to create test copies of data as well as a source for backups. For this reason, consider the impact of snapshots on MPP databases when looking at an environment, not just how fast the database performs when it is first deployed.
In our testing, we used the native snapshot capabilities of PowerFlex to measure the impact that snapshots have on performance. Using PowerFlex snapshots provides significant flexibility in data protection and cloning operations that are commonly performed in data centers.
We found that when the first storage-consistent snapshot of the database volumes was taken, the test took 45 seconds longer to complete than initial tests. This result was because it was the first snapshot of the volumes. Follow-on snapshots during testing resulted in minimal impact to the environment. This minimal impact is significant for MPP databases in which performance is important. (Of course, performance can vary with each deployment.)
We hope that these findings help administrators who are building a Greenplum environment feel more at ease. You not only have a solution guide to refer to as you architect the environment, you can be confident that it was built on best-in-class infrastructure and validated using common testing tools and real-world queries.
The bottom line
Now that you know the assignment is coming -- building an MPP database using VMware Tanzu Greenplum -- are you up to the challenge?
If you are, be sure to read the solution guide. If you need additional guidance on building your Greenplum environment on PowerFlex, be sure to reach out to your Dell representative.