Short topics related to data storage essentials.
CloudIQ: Cloud-based Monitoring for your Dell Technologies IT Environment
Wed, 25 May 2022 19:49:28 -0000
CloudIQ is Dell’s cloud-based AIOps application for monitoring Dell infrastructure across core, edge, and cloud environments. Born out of the Dell Unity storage product group several years ago, CloudIQ has quickly grown to cover a broad range of Dell Technologies products. With the latest addition of PowerSwitch, CloudIQ now covers Dell’s entire infrastructure portfolio, including compute, networking, CI/HCI, data protection, and storage systems.
According to a survey conducted last year, IT organizations using CloudIQ were able to resolve infrastructure issues two to ten times faster and save a day per week on average.[1]
Figure 1. CloudIQ Supported Platforms
CloudIQ has a variety of innovative features based on machine learning and other algorithms that help you reduce risk, plan ahead, and improve productivity. These features include the proactive health score, performance impact and anomaly detection, workload contention identification, capacity forecasting and anomaly detection, cybersecurity monitoring, reclaimable storage identification, and VMware integration.
With custom reporting features, Webhooks, and a REST API, you can integrate data from CloudIQ into ticketing, collaboration, and automation tools and processes that you use in day-to-day IT operations.
Best of all, CloudIQ is included with your standard Dell ProSupport and ProSupport Plus contracts at no extra cost.
Keep an eye out for follow-up blogs discussing CloudIQ’s key features in more detail!
Figure 2. CloudIQ Overview Page
With the addition of PowerSwitch support, CloudIQ now lets users monitor the full range of their Dell Technologies IT infrastructure from a single user interface. Because it is a cloud offering hosted in a secure Dell IT environment, CloudIQ is accessible from virtually anywhere: simply open a web browser, point it to https://cloudiq.dell.com, and log in with your Dell support credentials. Being cloud-based also means that you always have access to the latest features, because CloudIQ’s agile development process allows for continuous and seamless updates without any effort on your part. There is also a mobile app, so you can take CloudIQ anywhere.
How do you become more familiar with Dell Technologies and CloudIQ? The Dell Technologies Info Hub site provides expertise that helps to ensure customer success with Dell Technologies platforms. We have CloudIQ demos, white papers, and videos available at the Dell Technologies CloudIQ page. Also, feel free to reference the CloudIQ Overview white paper, which provides an in-depth summary of CloudIQ.
[1] Based on a Dell Technologies survey of CloudIQ users conducted May-June 2021. Actual results may vary.
Author: Derek Barboza, Senior Principal Engineering Technologist
How to Build a Custom Dell CSI Driver
Wed, 20 Apr 2022 21:28:38 -0000
With all the Dell Container Storage Interface (CSI) drivers and dependencies being open-source, anyone can tweak them to fit a specific use case.
As a practical example, this blog shows how to build a patched version of the Dell CSI Driver for PowerScale that supports a longer mounted path.
The CSI specification requires that a driver accept a maximum path length of at least 128 bytes:
// SP SHOULD support the maximum path length allowed by the operating
// system/filesystem, but, at a minimum, SP MUST accept a max path
// length of at least 128 bytes.
Dell drivers use the gocsi library as common boilerplate for CSI development, and it is that library that enforces the 128-byte maximum path length.
The PowerScale hardware supports path lengths of up to 1023 characters, as described in the File system guidelines chapter of the PowerScale spec. We’ll therefore build a csi-powerscale driver that supports paths up to that maximum.
The Dell CSI drivers are all written in Go and run as containers, so the prerequisites are relatively simple. You need a Go toolchain, git, and a container image build tool such as Docker or Podman.
The first thing to do is to clone the official csi-powerscale repository in your GOPATH source directory.
cd $GOPATH/src/github.com/
git clone git@github.com:dell/csi-powerscale.git dell/csi-powerscale
cd dell/csi-powerscale
You can then pick the version of the driver you want to patch; git tag gives the list of versions.
In this example, we pick v2.1.0 and branch from it with git checkout v2.1.0 -b v2.1.0-longer-path.
The next step is to obtain the library we want to patch.
gocsi and every other open-source component maintained for Dell CSI are available on https://github.com/dell.
The following figure shows how to fork the repository under your own GitHub account:
Now we can get the library with:
cd $GOPATH/src/github.com/
git clone git@github.com:coulof/gocsi.git coulof/gocsi
cd coulof/gocsi
To simplify the maintenance and merging of future commits, it is wise to add the original repo as an upstream remote with:
git remote add upstream git@github.com:dell/gocsi.git
The next important step is to pick and choose the correct library version used by our version of the driver.
We can check the csi-powerscale dependency file with: grep gocsi $GOPATH/src/github.com/dell/csi-powerscale/go.mod and create a branch of that version. In this case, the version is v1.5.0, and we can branch it with: git checkout v1.5.0 -b v1.5.0-longer-path.
Now it’s time to hack our patch, which is… just a one-liner:
--- a/middleware/specvalidator/spec_validator.go
+++ b/middleware/specvalidator/spec_validator.go
@@ -770,7 +770,7 @@ func validateVolumeCapabilitiesArg(
}
const (
- maxFieldString = 128
+ maxFieldString = 1023
maxFieldMap = 4096
maxFieldNodeId = 256
)
We can then commit and push our patched library with a nice tag:
git commit -a -m 'increase path limit'
git push --set-upstream origin v1.5.0-longer-path
git tag -a v1.5.0-longer-path
git push --tags
With the patch committed and pushed, it’s time to build the CSI driver binary and its container image.
Let’s go back to the csi-powerscale main repo: cd $GOPATH/src/github.com/dell/csi-powerscale
We can take advantage of the replace directive in the go.mod file to point to the patched library. In this case, we add the following:
diff --git a/go.mod b/go.mod
index 5c274b4..c4c8556 100644
--- a/go.mod
+++ b/go.mod
@@ -26,6 +26,7 @@ require (
)
replace (
+ github.com/dell/gocsi => github.com/coulof/gocsi v1.5.0-longer-path
k8s.io/api => k8s.io/api v0.20.2
k8s.io/apiextensions-apiserver => k8s.io/apiextensions-apiserver v0.20.2
k8s.io/apimachinery => k8s.io/apimachinery v0.20.2
When that is done, we obtain the new module from the online repo with: go mod download
Note: If you want to test the changes locally only, you can use the replace directive to point to the local directory with:
replace github.com/dell/gocsi => ../../coulof/gocsi
We can then build our new driver binary locally with: make build
After compiling it successfully, we can create the image. The shortest path is to replace the csi-isilon binary in the dellemc/csi-isilon Docker image with:
cat << EOF > Dockerfile.patch
FROM dellemc/csi-isilon:v2.1.0
COPY "csi-isilon" .
EOF
docker build -t coulof/csi-isilon:v2.1.0-long-path -f Dockerfile.patch .
Alternatively, you can rebuild the entire Docker image using the provided Makefile.
By default, the driver image is built on a Red Hat Universal Base Image (UBI) minimal. That base image sometimes lacks dependencies, so you can use another flavor, such as:
BASEIMAGE=registry.fedoraproject.org/fedora-minimal:latest REGISTRY=docker.io IMAGENAME=coulof/csi-powerscale IMAGETAG=v2.1.0-long-path make podman-build
The image is ready to be pushed to whatever image registry you prefer; in this case, hub.docker.com: docker push coulof/csi-isilon:v2.1.0-long-path.
The last step is to replace the driver image used in your Kubernetes cluster with your custom one.
Again, multiple solutions are possible, and the one to choose depends on how you deployed the driver.
If you used the helm installer, you can add the following block at the top of the myvalues.yaml file:
images:
  driver: docker.io/coulof/csi-powerscale:v2.1.0-long-path
Then update or uninstall/reinstall the driver as described in the documentation.
If you decided to use the Dell CSI Operator, you can simply point to the new image:
apiVersion: storage.dell.com/v1
kind: CSIIsilon
metadata:
  name: isilon
spec:
  driver:
    common:
      image: "docker.io/coulof/csi-powerscale:v2.1.0-long-path"
  ...
Or, if you want to do a quick and dirty test, you can create a patch file (here named path_csi-isilon_controller_image.yaml) with the following content:
spec:
  template:
    spec:
      containers:
        - name: driver
          image: docker.io/coulof/csi-powerscale:v2.1.0-long-path
You can then apply it to your existing install with: kubectl patch deployment -n powerscale isilon-controller --patch-file path_csi-isilon_controller_image.yaml
In all cases, you can check that everything works by first making sure that the Pod is started:
kubectl get pods -n powerscale
and that the logs are clean:
kubectl logs -n powerscale -l app=isilon-controller -c driver
As demonstrated, thanks to open source, it’s easy to fix and improve the Dell CSI drivers and Dell Container Storage Modules.
Keep in mind that Dell officially supports (through tickets, Service Requests, and so on) the image and binary, but not the custom build.
Thanks for reading and stay tuned for future posts on Dell Storage and Kubernetes!
Author: Florian Coulombel
Announcing: the New Dell SRM Hands-on Lab
Thu, 07 Apr 2022 14:26:51 -0000
We are happy to announce the release of the new SRM hands-on lab.
This lab is based on the latest SRM release (4.7.0.0), which introduced many new features, enhancements, and newly supported platforms.
To find the lab, go to the demo center (https://democenter.delltechnologies.com) and enter “srm” in the search box; the lab will appear in the results.
The lab’s welcome screen includes a network diagram and a comprehensive lab guide.
The first module, “What’s New”, focuses on the new features, enhancements, and newly supported platforms introduced in this release.
The rest of the modules cover in-depth SRM use cases and some of the main SRM features. Each module is independent, so you can focus on your area of interest.
You can also check out the variety of SRM dashboards available, and the lab includes a great selection of SRM reports containing data from supported vendors and technologies.
The SRM 4.7.0.0 hands-on lab helps you experience SRM use cases and features by browsing through the powerful user interface and exploring data from multiple vendors and technologies.
Enjoy the SRM hands-on lab! If you have any questions, please contact us at support@democenter.dell.com.
Author: Dejan Stojanovic
Looking Ahead: Dell Container Storage Modules 1.2
Mon, 21 Mar 2022 14:31:56 -0000
The quarterly update for Dell CSI Drivers & Dell Container Storage Modules (CSM) is here! Here’s what we’re planning.
Dell Container Storage Modules (CSM) add data services and features that are not in the scope of the CSI specification today. With an ever-growing ecosystem and feature set, the deployment of a driver and its affiliated modules needs to be carefully planned, and the new CSM Operator simplifies that deployment.
In the short to middle term, the CSM Operator will deprecate the experimental CSM Installer.
For disaster recovery protection, PowerScale implements data replication between appliances by means of the SyncIQ feature. SyncIQ replicates data between two sites, where one is read-write and the other read-only, similar to other Dell storage backends with asynchronous or synchronous replication.
The role of the CSM replication module and underlying CSI driver is to provision the volume within Kubernetes clusters and prepare the export configurations, quotas, and so on.
CSM Replication for PowerScale has been designed and implemented in such a way that it won’t collide with your existing Superna Eyeglass DR utility.
A live-action demo will be posted in the coming weeks on our VP Itzik Reich’s YouTube channel: https://www.youtube.com/user/itzikreich/.
In this release, each CSI driver:
Kubernetes v1.19 introduced fsGroupPolicy to give the CSI driver more control over how the permissions set in a Pod’s securityContext are applied to a volume.
There are three possible options: None, File, and ReadWriteOnceWithFSType.
In all cases, Dell CSI drivers let kubelet perform the change ownership operations and do not do it at the driver level.
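As an illustration, here is a minimal sketch of a CSIDriver object setting the policy. The driver name and the chosen value are placeholders; the actual CSIDriver object is typically created by the driver’s own installation, so treat this as a reading aid rather than something to apply:
# Illustrative only: driver name and policy value are placeholders.
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: csi-isilon.dellemc.com
spec:
  # None: kubelet never changes ownership of the volume.
  # File: kubelet always applies the Pod's fsGroup.
  # ReadWriteOnceWithFSType: kubelet applies fsGroup only to RWO volumes
  # that define an fstype (the historical default behavior).
  fsGroupPolicy: File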
Drivers for PowerFlex and Unity can now be installed with the help of the install scripts we provide under the dell-csi-installer directory.
A standalone Helm chart makes it easy to integrate the driver installation with a continuous-deployment agent such as Flux or Argo CD.
Note: To ensure that you install the driver on a supported Kubernetes version, the Helm charts take advantage of the kubeVersion field. Some Kubernetes distributions report version strings in kubectl version (such as v1.21.3-mirantis-1 and v1.20.7-eks-1-20-7) that require manual editing of the charts.
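For example, a Chart.yaml constraint like the following (illustrative values, not the actual chart content) is evaluated as a semver range; a version string such as v1.21.3-mirantis-1 parses as a pre-release and can fail the check even though the cluster itself is supported:
# Chart.yaml (illustrative): kubeVersion is matched as a semver range,
# and pre-release suffixes like "-mirantis-1" are excluded by default.
apiVersion: v2
name: csi-unity
version: 2.2.0
kubeVersion: ">= 1.21.0 < 1.25.0"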
Drivers for PowerFlex and Unity implement Volume Health Monitoring.
This feature is currently alpha in Kubernetes (as of Q1 2022) and is disabled in a default installation.
Once enabled, the drivers expose standard storage metrics, such as capacity usage and inode usage, through the Kubernetes /metrics endpoint. These metrics flow natively into popular dashboards like the ones built into OpenShift Monitoring.
All Dell drivers and dependencies like gopowerstore, gobrick, and more are now on GitHub and will be fully open-sourced. The umbrella project is and remains https://github.com/dell/csm, from which you can open tickets and see the roadmap.
The Dell partnership with Google continues, and the latest CSI drivers for PowerScale and PowerStore support Anthos v1.9.
Both CSI PowerScale and CSI PowerStore now allow setting the default permissions for newly created volumes, using either POSIX octal notation or ACLs.
For more details you can:
Author: Florian Coulombel
PowerMax and PowerStore Cyber Security
Tue, 15 Mar 2022 19:08:11 -0000
Dell Technologies takes a comprehensive approach to cyber resiliency and is committed to helping customers achieve their security objectives and requirements. Storage Engineering Technologists Richard Pace, Justin Bastin, and Derek Barboza worked together, across platforms, to deliver three independent cyber security white papers for PowerMax, Mainframe, and PowerStore.
Each paper acts as a single point of reference where customers can gain an understanding of the robust features and data services available to safeguard sensitive and mission-critical data in the event of a cybercrime. All three papers leverage CloudIQ and its Cybersecurity feature to give customers insight into anomaly detection.
The following figure shows a CloudIQ anomaly that indicates unusual behavior in a customer’s environment:
Backed by CyberSecurity in CloudIQ, we can see how quickly CloudIQ detects the issue and provides the details for manual remediation.
Dell has an ingrained culture of security. We follow a ‘shift-left’ approach that ensures that security is baked into every process in the development life cycle. The Dell Secure Development Lifecycle (SDL) defines security controls, based on industry standards, that Dell product teams adopt while developing new features and functionality. Our SDL includes both analysis activities and prescriptive proactive controls around key risk areas.
Dell strives to help our customers minimize risk associated with security vulnerabilities in our products. Our goal is to provide customers with timely information, guidance, and mitigation options to address vulnerabilities. The Dell Product Security Incident Response Team (Dell PSIRT) is chartered and responsible for coordinating the response and disclosure for all product vulnerabilities that are reported to Dell. Dell employs a rigorous process to continually evaluate and improve our vulnerability response practices, and regularly benchmarks these against the rest of the industry.
Authors: Richard Pace, Justin F. Bastin
CSI drivers 2.0 and Dell EMC Container Storage Modules GA!
Thu, 14 Oct 2021 11:40:35 -0000
The quarterly update for Dell CSI Driver is here! But today marks a significant milestone because we are also announcing the availability of Dell EMC Container Storage Modules (CSM). Here’s what we’re covering in this blog:
Dell Container Storage Modules is a set of modules that aims to extend Kubernetes storage features beyond what is available in the CSI specification.
The CSM modules expose enterprise storage features directly within Kubernetes, empowering developers to leverage them seamlessly in their deployments.
Most of these modules are released as sidecar containers that work with the CSI driver for the Dell storage array technology you use.
CSM modules are open-source and freely available from https://github.com/dell/csm.
Many stateful apps run on top of multiple volumes. For example, a transactional DB like Postgres may have one volume for its data and another for the redo log, while a distributed database like Cassandra has a volume per node, and so on.
When you want a recoverable snapshot of such an application, it is vital that the snapshots of all its volumes be taken at the exact same time.
Dell CSI Volume Group Snapshotter solves that problem for you. With the help of a CustomResourceDefinition, an additional sidecar to the Dell CSI drivers, and vanilla Kubernetes snapshots, you can manage the life cycle of crash-consistent snapshots: the drivers create snapshots of a whole group of volumes simultaneously, and can restore or move them in one shot.
To take a crash-consistent snapshot, you can either use labels on your PersistentVolumeClaim, or be explicit and pass the list of PVCs that you want to snap. For example:
apiVersion: volumegroup.storage.dell.com/v1alpha2
kind: DellCsiVolumeGroupSnapshot
metadata:
  # Name must be 13 characters or less in length
  name: "vg-snaprun1"
spec:
  driverName: "csi-vxflexos.dellemc.com"
  memberReclaimPolicy: "Retain"
  volumesnapshotclass: "powerflex-snapclass"
  pvcLabel: "vgs-snap-label"
  # pvcList:
  #   - "pvcName1"
  #   - "pvcName2"
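The volumesnapshotclass referenced above is a standard Kubernetes VolumeSnapshotClass (it requires the external-snapshotter CRDs to be installed). A minimal sketch for the PowerFlex driver could look like this; the deletionPolicy value here is an assumption, so pick what fits your retention needs:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: powerflex-snapclass
driver: csi-vxflexos.dellemc.com   # same driver name as in the example above
deletionPolicy: Retain             # assumed; Delete is also valid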
For the first release, CSI for PowerFlex supports Volume Group Snapshot.
The CSM Observability module is delivered as an OpenTelemetry agent that collects array-level metrics and exposes them for scraping into a Prometheus DB.
The integration is as easy as creating a ServiceMonitor for Prometheus. For example:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: otel-collector
  namespace: powerstore
spec:
  endpoints:
    - path: /metrics
      port: exporter-https
      scheme: https
      tlsConfig:
        insecureSkipVerify: true
  selector:
    matchLabels:
      app.kubernetes.io/instance: karavi-observability
      app.kubernetes.io/name: otel-collector
With the observability module, you gain visibility into the capacity of the volumes you manage with Dell CSI drivers, as well as their performance in terms of bandwidth, IOPS, and response time.
Thanks to pre-canned Grafana dashboards, you can browse the history of these metrics and see the topology from a Kubernetes PersistentVolume (PV) down to its translation as a LUN or file share in the backend array.
The Kubernetes admin can also collect array-level metrics to check the overall capacity and performance directly from the familiar Prometheus/Grafana tools.
For the first release, Dell EMC PowerFlex and Dell EMC PowerStore support CSM Observability.
Each Dell storage array supports replication capabilities. It can be asynchronous with an associated recovery point objective, synchronous replication between sites, or even active-active.
Each replication type serves a different purpose related to the use-case or the constraint you have on your data centers.
The Dell CSM replication module allows creating a persistent volume that can be of any of three replication types -- synchronous, asynchronous, and metro -- assuming the underlying storage box supports it.
The Kubernetes architecture can build on a stretched cluster between two sites or on two or more independent clusters. The module itself is composed of three main components: a cluster-wide replication controller, a replication sidecar that accompanies each CSI driver, and the repctl command-line utility.
The usual workflow is to create a PVC that is replicated with a classic Kubernetes directive by just picking the right StorageClass. You can then use repctl or edit the DellCSIReplicationGroup CRD to launch operations like Failover, Failback, Reprotect, Suspend, Synchronize, and so on.
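As a sketch, a replication-enabled StorageClass could look like the following. The parameter keys follow the replication.storage.dell.com prefix used by CSM Replication, but treat the exact keys and values here as illustrative and check the module’s documentation for your driver:
# Illustrative replication-enabled StorageClass for CSI PowerStore.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: powerstore-replicated            # illustrative name
provisioner: csi-powerstore.dellemc.com
reclaimPolicy: Delete
parameters:
  replication.storage.dell.com/isReplicationEnabled: "true"
  # Illustrative remote-side settings: which cluster and StorageClass
  # host the replicated volume.
  replication.storage.dell.com/remoteClusterID: "target-cluster"
  replication.storage.dell.com/remoteStorageClassName: "powerstore-replicated"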
For the first release, Dell EMC PowerMax and Dell EMC PowerStore support CSM Replication.
With CSM Authorization we are giving back more control of storage consumption to the storage administrator.
The authorization module is an independent service, installed and owned by the storage admin.
Within that module, the storage administrator will create access control policies and storage quotas to make sure that Kubernetes consumers are not overconsuming storage or trying to access data that doesn’t belong to them.
CSM Authorization makes multi-tenant architecture real by enforcing Role-Based Access Control on storage objects coming from multiple and independent Kubernetes clusters.
The authorization module acts as a proxy between the CSI driver and the backend array. Access is granted with an access token that can be revoked at any point in time. Quotas can be changed on the fly to limit or increase storage consumption from the different tenants.
For the first release, Dell EMC PowerMax and Dell EMC PowerFlex support CSM Authorization.
When dealing with stateful apps, if a node goes down, vanilla Kubernetes is pretty conservative.
Indeed, from the Kubernetes control plane, the failing node is seen as NotReady. That can be because the node is down, because of network partitioning between the control plane and the node, or simply because the kubelet is down. In the latter two scenarios, the stateful app is still running and possibly writing data to disk. Therefore, Kubernetes won’t take action and lets the admin manually trigger a Pod deletion if desired.
The CSM Resiliency module (sometimes named PodMon) aims to improve that behavior with the help of collected metrics from the array.
Because the driver has access to the storage backend from pretty much all the other nodes, we can see the volume status (mapped or not) and its activity (whether there are IOPS or not). So when a node goes into the NotReady state and we see no IOPS on the volume, Resiliency relocates the Pod to a new node and cleans up whatever leftover objects might exist.
The entire process, from the moment a node is seen as down to the rescheduling of the Pod, happens in seconds.
To protect an app with the resiliency module, you only have to add the label podmon.dellemc.com/driver to it, and it is then protected.
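For example, in a StatefulSet template (a minimal sketch; the label value is an assumption based on the PowerFlex driver name, so check the Resiliency documentation for the exact value expected):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
        # Opt the Pod in to CSM Resiliency (PodMon) monitoring.
        podmon.dellemc.com/driver: csi-vxflexos   # assumed value
    spec:
      containers:
        - name: postgres
          image: postgres:14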
For more details on the module’s design, you can check the CSM documentation.
For the first release, Dell EMC PowerFlex and Dell EMC Unity support CSM Resiliency.
Each module above is released either as an independent Helm chart or as an option within the CSI drivers.
For more complex deployments, which may involve multiple Kubernetes clusters or a mix of modules, it is possible to use the CSM Installer.
The CSM Installer, built on top of Carvel, gives the user a single command line to create their CSM-CSI applications and to manage them outside the Kubernetes cluster.
For the first release, all drivers and modules support the CSM Installer.
For each driver, this release provides:
VMware Tanzu offers storage management by means of its CNS-CSI driver, but it doesn’t support the ReadWriteMany access mode.
If your workload needs concurrent access to the filesystem, you can now rely on the CSI drivers for PowerStore, PowerScale, and Unity through the NFS protocol. The three platforms are officially supported and qualified on Tanzu.
The NFS support in the drivers for PowerStore, PowerScale, and Unity has also been tested and works when the Kubernetes cluster is behind a private network.
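For instance, a ReadWriteMany claim against an NFS-backed class looks like this (the StorageClass name is illustrative):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany                  # concurrent access from multiple Pods/nodes
  resources:
    requests:
      storage: 10Gi
  storageClassName: powerstore-nfs   # illustrative NFS-backed StorageClass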
By default, the CSI driver creates volumes with 0777 POSIX permissions on the directory.
Now, with the isiVolumePathPermissions parameter, you can use ACLs or set less permissive POSIX rights.
The isiVolumePathPermissions parameter can be configured as part of the ConfigMap with the PowerScale settings, or at the StorageClass level. The accepted values are private_read, private, public_read, public_read_write, and public for ACLs, or any POSIX mode in octal notation.
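Here is a sketch of a StorageClass using the parameter; the class name is illustrative and the provisioner name assumes the default used by CSI PowerScale:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: powerscale-private-read       # illustrative name
provisioner: csi-isilon.dellemc.com   # assumed default provisioner name
reclaimPolicy: Delete
parameters:
  # ACL preset applied to newly created volume directories;
  # a POSIX octal mode such as "0755" is accepted as well.
  isiVolumePathPermissions: "private_read"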
For more details you can:
Author: Florian Coulombel