Mon, 20 May 2024 17:00:00 -0000
We are excited to announce the availability of Dell APEX Navigator for Kubernetes! This offering is part of the Dell Premier account experience that includes the APEX Navigator user interface and shares the management interface with APEX Navigator for Storage. In a three-part blog series, we will go through the key aspects of the APEX Navigator for Kubernetes:
Once you log in as an ITOps user, you can navigate to the Kubernetes section under the Manage section in the left-side navigation bar:
The details pane has four tabs:
Kubernetes clusters both on-prem and on public clouds can be managed with APEX Navigator for Kubernetes.
Before you onboard a cluster, please go through the following steps to make sure the cluster is ready to be onboarded.
Dell CSM Operator is a Kubernetes Operator designed to manage the Dell Storage CSI drivers and Container Storage Modules. Install v1.4.4 or later using the instructions here.
The Dell Connectivity Client initiates a connection to https://connect-into.dell.com/ in order to communicate with APEX Navigator for Kubernetes. Therefore, any firewall or proxy between the Kubernetes cluster and that address must allow this outbound connection.
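If you want to confirm that the cluster can reach this endpoint before installing anything, a quick hedged check from a cluster node (assuming curl is available and any required proxy is set in the environment) is:
$ curl -sv -o /dev/null https://connect-into.dell.com/
A successful TLS handshake indicates that the firewall and proxy path is open.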
To install the client, first make sure the following namespaces are created on the cluster:
$ kubectl create namespace karavi
$ kubectl create namespace dell-csm
$ kubectl create namespace dell-connectivity-client
You can get the custom resource definition (CRD) YAML file to install the Dell Connectivity Client resource from the CSM Operator GitHub repo. Once you have the YAML file you can install the client service as follows:
$ kubectl apply -f dcm-client.yml
You can verify the installation; the output should look like the following:
$ kubectl get pods -n dell-connectivity-client
NAME READY STATUS RESTARTS AGE
dell-connectivity-client-0 3/3 Running 0 70s
Note: If you remove a cluster, you must re-install the Dell Connectivity Client before you onboard it again.
On the License tab, you can add the licenses that you have for APEX Navigator for Kubernetes. You will assign one of these licenses to the cluster once it is connected.
Once you have the CSM Operator and Dell Connectivity Client running on the cluster, you can connect to the cluster from the APEX Navigator UI. Here are the steps involved in establishing trust between the APEX Navigator and the Kubernetes cluster to onboard the cluster.
Follow the instructions in the UI to create the command that you need to run on your cluster to generate a token. Then copy the token (underlined in the figure below) and paste it into the Install token field:
After this step, another command is generated that needs to be run on the cluster to complete the trusted connection process, as shown in the following figure:
Once the cluster successfully connects, it may still be listed in grey, indicating that it requires a license. Click the ellipsis (…) button in the Actions column on the right-hand side of the cluster row and select “Manage license”. In the License selection dialog, select the license that you want to assign to the cluster. This step completes the onboarding of the cluster.
Following are the steps to remove a Kubernetes cluster from APEX Navigator for Kubernetes:
1. Uninstall all modules:
2. Unassign the license
3. Remove the cluster from the interface.
4. Uninstall the connectivity client on your cluster:
kubectl delete -n dell-connectivity-client apexconnectivityclients
After these four steps, the cluster is cleaned of every Dell CSI/CSM/APEX Navigator resource.
APEX Navigator for Kubernetes supports both on-prem and on-cloud Dell storage platforms. On-prem storage systems can be added using a simple dialog as shown below:
If you would like to use APEX Block Storage on AWS, please make sure you have the required licenses for APEX Navigator for Storage and have onboarded your AWS account onto the APEX Navigator platform. You can deploy an APEX Block Storage cluster on AWS with just a few clicks (watch this demo video on YouTube) and start using the cloud storage for Kubernetes.
Authors:
Parasar Kodati, Engineering Technologist, Dell ISG
Florian Coulombel, Engineering Technologist, Dell ISG
Mon, 20 May 2024 17:00:00 -0000
This is part 2 of the three-part blog post series introducing Dell APEX Navigator for Kubernetes. In this post, we will cover batch deployment of CSMs on any number of onboarded Kubernetes clusters.
A major advantage of using APEX Navigator for Kubernetes is the ability to deploy multiple CSMs onto multiple Kubernetes clusters that consume storage from different Dell storage systems (including Dell APEX Block Storage on AWS). Multiple install jobs are launched simultaneously on the clusters to enable parallel installation, which saves time and effort for admins managing storage for a growing Kubernetes footprint. Let us see how this can be achieved.
From the Clusters tab, click Manage Clusters and select Install modules:
This launches the Module Installation wizard, where you can install specific Dell Container Storage Modules and components such as the SDC client for PowerFlex storage across an entire set of clusters. This ensures the same storage class and other configuration parameters are used across all the clusters for consistency and standardization. The first release of APEX Navigator for Kubernetes supports only the Observability, Authorization, and Application Mobility CSMs; more services will be added over time.
In the CSM deployment wizard, the first step is to select all the clusters where the CSMs need to be installed.
Then, you can select the storage systems for each of the clusters. In the figure below, the selected clusters are sharing the same storage.
In the next step, the storage class is set for each cluster and storage system pair:
On the summary page of the wizard, you can review the install configurations and click Install to start the installation process. You can track the multiple parallel install jobs on multiple clusters:
Authors:
Parasar Kodati, Engineering Technologist, Dell ISG
Florian Coulombel, Engineering Technologist, Dell ISG
Mon, 20 May 2024 17:00:00 -0000
This is part 3 of the three-part blog series introducing Dell APEX Navigator for Kubernetes.
Data and application mobility is an essential element in maintaining the required availability and service level for a given application. From a workload standpoint, the application needs to have a redundant instance at a target site that can be used as a failover instance. For this to work for stateful applications, we need to ensure data availability on the target site. Data mobility can be achieved in two ways:
The Replication container storage module for Dell storage platforms orchestrates data replication using the storage platform’s native replication capabilities. The Application Mobility module, on the other hand, uses the host-based backup approach. While both approaches work for Dell storage platforms through the command line interface, the first release of the APEX Navigator for Kubernetes user interface supports only the host-based backup functionality, that is, the Application Mobility module.
The following are the prerequisites for application mobility:
We already covered how to connect clusters and storage in previous sections. Let us see how to set up the S3 Object Store within the APEX Navigator for Kubernetes.
Navigate to the Storage tab and click on the “Add object store” button. This launches a dialog to add the details of the Object store:
If you are using Amazon Web Services (AWS) S3 for the object store, the region on the Kubernetes backup storage location object needs to be updated prior to creating a clone. On each Kubernetes cluster where Application Mobility is installed, run this command to update the region:
kubectl patch backupstoragelocation/default -n dell-csm --type='merge' -p '{"spec":{"config":{"region":"<region-name>"}}}'
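To confirm that the patch took effect, you can read the field back; this is a simple sanity check using the same resource and namespace as the patch command above:
kubectl get backupstoragelocation default -n dell-csm -o jsonpath='{.spec.config.region}'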
To start an Application mobility job, go to the Application Mobility tab and click “Create Clone”. This launches a wizard that takes you through the following steps:
You can track mobility jobs under the Jobs section, as shown below:
Authors:
Parasar Kodati, Engineering Technologist, Dell ISG
Florian Coulombel, Engineering Technologist, Dell ISG
Mon, 29 Apr 2024 19:20:40 -0000
The Dell infrastructure portfolio spans the entire hybrid cloud, from storage to compute to networking, and all the software functionality to deploy, manage, and monitor different application stacks, from traditional databases to containerized applications deployed on Kubernetes. When it comes to integrating the infrastructure portfolio with third-party IT Operations platforms, Ansible is at the top of the list in terms of expanding the scope and depth of integration.
Here is a summary of the enhancements we made to the various Ansible modules across the Dell portfolio in 2022:
For all Ansible projects, you can track progress, contribute, or report issues on the individual repositories.
You can also join our DevOps and Automation community at: https://www.dell.com/community/Automation/bd-p/Automation.
Happy New Year and happy upgrades!
Authors: Parasar Kodati and Florian Coulombel
Mon, 29 Apr 2024 18:55:46 -0000
Network connectivity is an essential part of any infrastructure architecture. When it comes to how Kubernetes connects to PowerScale, there are several options to configure the Container Storage Interface (CSI). In this post, we will cover the concepts and configuration you can implement.
The story starts with CSI plugin architecture.
Like all other Dell storage CSI drivers, PowerScale CSI follows the Kubernetes CSI standard by implementing functions in two components.
The CSI controller plugin is deployed as a Kubernetes Deployment, typically with two or three replicas for high availability, with only one instance acting as the leader. The controller is responsible for communicating with PowerScale through the Platform API to manage volumes (for PowerScale, this means creating and deleting directories, NFS exports, and quotas), to update the NFS client list when a Pod moves, and so on.
A CSI node plugin is a Kubernetes DaemonSet, running on all nodes by default. It is responsible for mounting the NFS export from PowerScale and mapping the NFS mount path into a Pod as persistent storage, so that applications and users in the Pod can access the data on PowerScale.
Because CSI needs to access both PAPI (PowerScale Platform API) and NFS data, a single user role typically isn’t secure enough: the role for PAPI access needs more privileges than normal users require.
According to the PowerScale CSI manual, CSI requires a user that has the following privileges to perform all CSI functions:
Privilege | Type |
ISI_PRIV_LOGIN_PAPI | Read Only |
ISI_PRIV_NFS | Read Write |
ISI_PRIV_QUOTA | Read Write |
ISI_PRIV_SNAPSHOT | Read Write |
ISI_PRIV_IFS_RESTORE | Read Only |
ISI_PRIV_NS_IFS_ACCESS | Read Only |
ISI_PRIV_IFS_BACKUP | Read Only |
Among these privileges, ISI_PRIV_SNAPSHOT and ISI_PRIV_QUOTA are only available in the System zone, which complicates things a bit. To fully utilize CSI features such as volume snapshot, volume clone, and volume capacity management, you have to allow the CSI driver to access the PowerScale System zone. If you enable CSM replication, the user also needs the ISI_PRIV_SYNCIQ privilege, which is a System-zone privilege too.
By contrast, there isn’t any specific role requirement for applications/users in Kubernetes to access data: the data is shared over the normal NFS protocol. As long as they have the right ACLs to access the files, they are good. For this data access requirement, a non-System zone is suitable and recommended.
These two access zones are defined in different places in CSI configuration files:
If an admin really cannot expose their System zone to the Kubernetes cluster, they have to disable the snapshot and quota features in the CSI installation configuration file (values.yaml). In this way, the PAPI access zone can be a non-System access zone.
The following diagram shows how the Kubernetes cluster connects to PowerScale access zones.
Normally a Kubernetes cluster comes with many networks: a pod inter-communication network, a cluster service network, and so on. Luckily, the PowerScale network doesn’t have to join any of them. The CSI pods can access a host’s network directly, without going through the Kubernetes internal network. This also has the advantage of providing a dedicated high-performance network for data transfer.
For example, on a Kubernetes host, there are two NICs: IP 192.168.1.x and 172.24.1.x. NIC 192.168.1.x is used for Kubernetes, and is aligned with its hostname. NIC 172.24.1.x isn’t managed by Kubernetes. In this case, we can use NIC 172.24.1.x for data transfer between Kubernetes hosts and PowerScale.
Because the CSI driver by default uses the IP that is aligned with the host’s hostname, to let CSI recognize the second NIC (172.24.1.x) we have to explicitly set the IP range in “allowedNetworks” in the values.yaml file during the CSI driver installation. For example:
allowedNetworks: [172.24.1.0/24]
Also, in this network configuration, it’s unlikely that the Kubernetes internal DNS can resolve the PowerScale FQDN. So, we also have to make sure the “dnsPolicy” has been set to “ClusterFirstWithHostNet” in the values.yaml file. With this dnsPolicy, the CSI pods will reach the DNS server in /etc/resolv.conf in the host OS, not the internal DNS server of Kubernetes.
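To make this concrete, here is a minimal sketch of the relevant values.yaml fragment combining the two settings discussed above; the exact placement of these keys can vary between driver versions, so treat it as an illustration rather than a definitive configuration:
allowedNetworks: [172.24.1.0/24]      # subnet of the NIC dedicated to NFS data traffic
dnsPolicy: ClusterFirstWithHostNet    # resolve the PowerScale FQDN via the host's /etc/resolv.conf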
The following diagram shows the configuration mentioned above:
Please note that the “allowedNetworks” setting only affects the data access zone, and not the PAPI access zone. In fact, CSI just uses this parameter to decide which host IP should be set as the NFS client IP on the PowerScale side.
Regarding the network routing, CSI simply follows the OS route configuration. Because of that, if we want the PAPI access zone to go through the primary NIC (192.168.1.x), and have the data access zone to go through the second NIC (172.24.1.x), we have to change the route configuration of the Kubernetes host, not this parameter.
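For example, a hypothetical static route on each Kubernetes host could pin the data access zone’s subnet to the second NIC; the IP addresses, gateway, and interface name below are illustrative assumptions, not values taken from this setup:
# Check which interface the host currently uses to reach a PowerScale data-zone IP
ip route get 172.24.1.50
# If the data access zone lives on another subnet, route it through the second NIC's gateway
ip route add 172.24.2.0/24 via 172.24.1.1 dev ens224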
Hopefully this blog helps you understand the network configuration for PowerScale CSI better. Stay tuned for more information on Dell Containers & Storage!
Authors: Sean Zhan, Florian Coulombel
Mon, 29 Apr 2024 18:40:02 -0000
The first release of 2023 for Kubernetes CSI Driver & Dell Container Storage Modules (CSM) is here!
The official changelog is available in the CHANGELOG directory of the CSM repository.
The newly supported Kubernetes distributions are:
Note: OpenShift 4.12 is not officially qualified yet. However, these modules have been tested against Kubernetes 1.25, which OpenShift 4.12 is based on, so you can install them using the Helm package manager (not the CSI or CSM Operators).
One of the major new features for CSI in CSM 1.6 is the Installation Wizard.
If you are a faithful reader of this blog series, you already know that Dell's CSI and CSM moved to pure Helm Charts and are distributed in our helm chart repository. This paved the way for the wizard installer!
Straight from the documentation portal, you can launch the wizard to configure and install the CSI and CSM modules for PowerStore and PowerMax. All the dependencies between CSI and CSM are managed.
The wizard doesn't aim to cover all use cases but gives an excellent default values.yaml, which can always be tuned later.
It has never been easier to install CSI and CSM!
cert-csi is Dell's test framework to validate and qualify drivers against the Kubernetes distributions.
If all tests from cert-csi pass, we call a platform (Linux OS + Kubernetes distribution) certified and officially supported by the Dell Engineering and Support structure.
With cert-csi open-sourced, the community can validate a platform, even if it’s not in the support matrix yet.
For more details about how to use the cert-csi utility, see the documentation portal.
The dellctl images CLI prints all the container images needed by Dell CSI drivers.
PowerMax Metro volumes are now fully compliant with the CSI specification for volume expansion, clone, and snapshot.
The CSM Operator is the future of the Operator framework for Dell CSI driver and Dell Container Storage Modules and now integrates the modules for PowerStore.
Kubernetes is notably conservative with StatefulSets on node failures: it won't reschedule them automatically and requires an administrator to force the deletion of the pods.
CSM resiliency solves that problem (and more) by querying the backend storage and getting the volumes' status to allow rescheduling in a few seconds after a node is NotReady for Kubernetes.
PowerStore is now part of the supported storage backends!
CSM Replication supports PowerFlex, and it is now possible to combine it with the PowerFlex offering in AWS. For these types of designs, it is recommended to have low latency between the source and the target. For example, here is the architecture of our lab:
And the result of a replicated volume in PowerFlex UI in AWS looks like this:
To learn more about PowerFlex in AWS, see the video Dell Project Alpine Gets Real with PowerFlex on AWS and the blog Dell PowerFlex is now available on AWS.
CSM Observability can collect PowerMax metrics, including the performance of the storage groups that back the PVC, the capacity of the storage resource pool, and more.
Stay informed of the latest updates to the Dell CSM ecosystem by subscribing to:
Author: Florian Coulombel
Mon, 29 Apr 2024 18:38:49 -0000
The quarterly update for Dell CSI Driver is here! But today marks a significant milestone because we are also announcing the availability of Dell EMC Container Storage Modules (CSM). Here’s what we’re covering in this blog:
Dell Container Storage Modules is a set of modules that aims to extend Kubernetes storage features beyond what is available in the CSI specification.
The CSM modules will expose storage enterprise features directly within Kubernetes, so developers are empowered to leverage them for their deployment in a seamless way.
Most of these modules are released as sidecar containers that work with the CSI driver for the Dell storage array technology you use.
CSM modules are open source and freely available from https://github.com/dell/csm.
Many stateful apps can run on top of multiple volumes. For example, we can have a transactional DB like Postgres with a volume for its data and another for the redo log, or Cassandra that is distributed across nodes, each having a volume, and so on.
When you want to take a recoverable snapshot, it is vital to take them consistently at the exact same time.
Dell CSI Volume Group Snapshotter solves that problem for you. With the help of a CustomResourceDefinition, an additional sidecar to the Dell CSI drivers, and vanilla Kubernetes snapshots, you can manage the life cycle of crash-consistent snapshots. This means you can create a group of volumes for which the drivers create, restore, or move snapshots simultaneously, in one shot!
To take a crash-consistent snapshot, you can either use labels on your PersistentVolumeClaims, or be explicit and pass the list of PVCs that you want to snap. For example:
apiVersion: volumegroup.storage.dell.com/v1alpha2
kind: DellCsiVolumeGroupSnapshot
metadata:
  # Name must be 13 characters or less in length
  name: "vg-snaprun1"
spec:
  driverName: "csi-vxflexos.dellemc.com"
  memberReclaimPolicy: "Retain"
  volumesnapshotclass: "poweflex-snapclass"
  pvcLabel: "vgs-snap-label"
  # pvcList:
  #   - "pvcName1"
  #   - "pvcName2"
For the first release, CSI for PowerFlex supports Volume Group Snapshot.
The CSM Observability module is delivered as an open-telemetry agent that collects array-level metrics to scrape them for storage in a Prometheus DB.
The integration is as easy as creating a Prometheus ServiceMonitor. For example:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: otel-collector
  namespace: powerstore
spec:
  endpoints:
    - path: /metrics
      port: exporter-https
      scheme: https
      tlsConfig:
        insecureSkipVerify: true
  selector:
    matchLabels:
      app.kubernetes.io/instance: karavi-observability
      app.kubernetes.io/name: otel-collector
With the observability module, you will gain visibility on the capacity of the volume you manage with Dell CSI drivers and their performance, in terms of bandwidth, IOPS, and response time.
Thanks to pre-canned Grafana dashboards, you will be able to browse these metrics’ history and see the topology from a Kubernetes PersistentVolume (PV) down to its translation as a LUN or file share in the backend array.
The Kubernetes admin can also collect array level metrics to check the overall capacity performance directly from the familiar Prometheus/Grafana tools.
For the first release, Dell EMC PowerFlex and Dell EMC PowerStore support CSM Observability.
Each Dell storage array supports replication capabilities. It can be asynchronous with an associated recovery point objective, synchronous replication between sites, or even active-active.
Each replication type serves a different purpose related to the use-case or the constraint you have on your data centers.
The Dell CSM replication module allows creating a persistent volume that can be of any of three replication types -- synchronous, asynchronous, and metro -- assuming the underlying storage box supports it.
The Kubernetes architecture can build on a stretched cluster between two sites or on two or more independent clusters. The module itself is composed of three main components:
The usual workflow is to create a PVC that is replicated with a classic Kubernetes directive by just picking the right StorageClass. You can then use repctl or edit the DellCSIReplicationGroup CRD to launch operations like Failover, Failback, Reprotect, Suspend, Synchronize, and so on.
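Assuming the custom resource follows the DellCSIReplicationGroup kind mentioned above, a quick way to inspect replication groups from the command line is a standard kubectl query; this is a sketch, and the resource short names may differ by release:
# List the replication group objects managed by the replication module
kubectl get dellcsireplicationgroups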
For the first release, Dell EMC PowerMax and Dell EMC PowerStore support CSM Replication.
With CSM Authorization we are giving back more control of storage consumption to the storage administrator.
The authorization module is an independent service, installed and owned by the storage admin.
Within that module, the storage administrator will create access control policies and storage quotas to make sure that Kubernetes consumers are not overconsuming storage or trying to access data that doesn’t belong to them.
CSM Authorization makes multi-tenant architecture real by enforcing Role-Based Access Control on storage objects coming from multiple and independent Kubernetes clusters.
The authorization module acts as a proxy between the CSI driver and the backend array. Access is granted with an access token that can be revoked at any point in time. Quotas can be changed on the fly to limit or increase storage consumption from the different tenants.
For the first release, Dell EMC PowerMax and Dell EMC PowerFlex support CSM Authorization.
When dealing with stateful apps, if a node goes down, vanilla Kubernetes is pretty conservative.
Indeed, from the Kubernetes control plane, the failing node is seen as NotReady. This can be because the node is down, because of network partitioning between the control plane and the node, or simply because the kubelet is down. In the latter two scenarios, the stateful app is still running and possibly writing data to disk. Therefore, Kubernetes won’t take action and lets the admin manually trigger a Pod deletion if desired.
The CSM Resiliency module (sometimes named PodMon) aims to improve that behavior with the help of collected metrics from the array.
Because the driver has access to the storage backend from pretty much all other nodes, we can see the volume status (mapped or not) and its activity (are there IOPS or not). So when a node goes into NotReady state, and we see no IOPS on the volume, Resiliency will relocate the Pod to a new node and clean whatever leftover objects might exist.
The entire process happens in seconds between the moment a node is seen down and the rescheduling of the Pod.
To protect an app with the resiliency module, you only have to add the label podmon.dellemc.com/driver to it, and it is then protected.
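As an illustration, the opt-in is just a label on the workload’s pod template. The snippet below is a hypothetical StatefulSet fragment; the label value (the driver name) is an assumption, so check the Resiliency documentation for the exact value expected by your driver:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-app
spec:
  template:
    metadata:
      labels:
        # Opt this application into CSM Resiliency (PodMon) monitoring
        podmon.dellemc.com/driver: csi-vxflexos.dellemc.com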
For more details on the module’s design, you can check the documentation here.
For the first release, Dell EMC PowerFlex and Dell EMC Unity support CSM Resiliency.
Each module above is released either as an independent helm chart or as an option within the CSI Drivers.
For more complex deployments, which may involve multiple Kubernetes clusters or a mix of modules, it is possible to use the csm installer.
The CSM Installer, built on top of Carvel, gives the user a single command line to create their CSM-CSI applications and to manage them from outside the Kubernetes cluster.
For the first release, all drivers and modules support the CSM Installer.
For each driver, this release provides:
VMware Tanzu offers storage management by means of its CNS-CSI driver, but it doesn’t support ReadWriteMany access mode.
If your workload needs concurrent access to the filesystem, you can now rely on CSI Driver for PowerStore, PowerScale and Unity through the NFS protocol. The three platforms are officially supported and qualified on Tanzu.
The NFS drivers for PowerStore, PowerScale, and Unity have all been tested and work when the Kubernetes cluster is behind a private network.
By default, the CSI driver creates volumes with 777 POSIX permission on the directory.
Now with the isiVolumePathPermissions parameter, you can use ACLs or any more permissive POSIX rights.
The isiVolumePathPermissions can be configured as part of the ConfigMap with the PowerScale settings or at the StorageClass level. The accepted parameter values are private_read, private, public_read, public_read_write, and public for ACLs, or any combination of POSIX mode bits.
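For example, a StorageClass-level setting might look like the following sketch; the StorageClass name and the POSIX mode shown are illustrative assumptions, and other required parameters are omitted for brevity:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: powerscale-restricted
provisioner: csi-isilon.dellemc.com
parameters:
  # Create volume directories with 0775 instead of the default 777
  isiVolumePathPermissions: "0775"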
For more details you can:
Author: Florian Coulombel
Mon, 29 Apr 2024 18:36:49 -0000
The quarterly update for Dell CSI Drivers & Dell Container Storage Modules (CSM) is here! Here’s what we’re planning.
Dell Container Storage Modules (CSM) add data services and features that are not in the scope of the CSI specification today. The new CSM Operator simplifies the deployment of CSMs. With an ever-growing ecosystem and added features, deploying a driver and its affiliated modules need to be carefully studied before beginning the deployment.
The new CSM Operator:
In the short/middle term, the CSM Operator will deprecate the experimental CSM Installer.
For disaster recovery protection, PowerScale implements data replication between appliances by means of the SyncIQ feature. SyncIQ replicates the data between two sites, where one is read-write while the other is read-only, similar to other Dell storage backends with async or sync replication.
The role of the CSM replication module and underlying CSI driver is to provision the volume within Kubernetes clusters and prepare the export configurations, quotas, and so on.
CSM Replication for PowerScale has been designed and implemented in such a way that it won’t collide with your existing Superna Eyeglass DR utility.
A live-action demo will be posted in the coming weeks on our VP’s YouTube channel: https://www.youtube.com/user/itzikreich/.
In this release, each CSI driver:
Kubernetes v1.19 introduced the fsGroupPolicy to give more control to the CSI driver over the permission sets in the securityContext.
There are three possible options: None, File, and ReadWriteOnceWithFSType (the default).
In all cases, Dell CSI drivers let kubelet perform the change ownership operations and do not do it at the driver level.
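For reference, fsGroupPolicy is declared on the Kubernetes CSIDriver object that the chart or operator creates. The sketch below uses the PowerScale driver name as an example; in practice the Dell installers manage this object for you:
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: csi-isilon.dellemc.com
spec:
  # Kubelet changes volume ownership/permissions only for RWO volumes with a defined fsType
  fsGroupPolicy: ReadWriteOnceWithFSType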
Drivers for PowerFlex and Unity can now be installed with the help of the install scripts we provide under the dell-csi-installer directory.
A standalone Helm chart helps to easily integrate the driver installation with the agent for Continuous Deployment like Flux or Argo CD.
Note: To ensure that you install the driver on a supported Kubernetes version, the Helm charts take advantage of the kubeVersion field. Some Kubernetes distributions use labels in kubectl version (such as v1.21.3-mirantis-1 and v1.20.7-eks-1-20-7) that require manual editing.
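If your distribution reports a suffixed version such as v1.21.3-mirantis-1, one hedged workaround is to relax the kubeVersion constraint in the chart’s Chart.yaml so that suffixed versions satisfy it, for example:
# Chart.yaml (illustrative): the trailing "-0" lets suffixed versions like v1.21.3-mirantis-1 match
kubeVersion: ">= 1.21.0-0"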
Drivers for PowerFlex and Unity implement Volume Health Monitoring.
This feature is currently in alpha in Kubernetes (in Q1-2022), and is disabled with a default installation.
Once enabled, the drivers will expose the standard storage metrics, such as capacity usage and inode usage through the Kubernetes /metrics endpoint. The metrics will flow natively in popular dashboards like the ones built-in OpenShift Monitoring:
All Dell drivers and dependencies like gopowerstore, gobrick, and more are now on GitHub and will be fully open sourced. The umbrella project is and remains https://github.com/dell/csm, from which you can open tickets and see the roadmap.
The Dell partnership with Google continues, and the latest CSI drivers for PowerScale and PowerStore support Anthos v1.9.
Both CSI PowerScale and PowerStore now allow setting the default permissions for the newly created volume. To do this, you can use POSIX octal notation or ACL.
For more details you can:
Author: Florian Coulombel
Mon, 29 Apr 2024 18:15:25 -0000
With all the Dell Container Storage Interface (CSI) drivers and dependencies being open-source, anyone can tweak them to fit a specific use case.
This blog shows how to create a patched version of a Dell CSI Driver for PowerScale.
As a practical example, the following steps show how to create a patched version of Dell CSI Driver for PowerScale that supports a longer mounted path.
The CSI specification states that a driver must accept a maximum path length of at least 128 bytes:
// SP SHOULD support the maximum path length allowed by the operating
// system/filesystem, but, at a minimum, SP MUST accept a max path
// length of at least 128 bytes.
Dell drivers use the gocsi library as a common boilerplate for CSI development. That library enforces the 128-byte maximum path length.
The PowerScale hardware supports path lengths up to 1023 characters, as described in the File system guidelines chapter of the PowerScale specifications. We’ll therefore build a csi-powerscale driver that supports that maximum path length.
The Dell CSI drivers are all built with golang and, obviously, run as a container. As a result, the prerequisites are relatively simple. You need:
The first thing to do is to clone the official csi-powerscale repository in your GOPATH source directory.
cd $GOPATH/src/github.com/
git clone git@github.com:dell/csi-powerscale.git dell/csi-powerscale
cd dell/csi-powerscale
You can then pick the version of the driver you want to patch; git tag gives the list of versions.
In this example, we pick the v2.1.0 with git checkout v2.1.0 -b v2.1.0-longer-path.
The next step is to obtain the library we want to patch.
gocsi and every other open-source component maintained for Dell CSI are available on https://github.com/dell.
The following figure shows how to fork the repository on your private github:
Now we can get the library with:
cd $GOPATH/src/github.com/
git clone git@github.com:coulof/gocsi.git coulof/gocsi
cd coulof/gocsi
To simplify the maintenance and merging of future commits, it is wise to add the original repo as an upstream remote with:
git remote add upstream git@github.com:dell/gocsi.git
The next important step is to pick and choose the correct library version used by our version of the driver.
We can check the csi-powerscale dependency file with: grep gocsi $GOPATH/src/github.com/dell/csi-powerscale/go.mod and create a branch of that version. In this case, the version is v1.5.0, and we can branch it with: git checkout v1.5.0 -b v1.5.0-longer-path.
Now it’s time to hack our patch, which is… just a one-liner:
--- a/middleware/specvalidator/spec_validator.go
+++ b/middleware/specvalidator/spec_validator.go
@@ -770,7 +770,7 @@ func validateVolumeCapabilitiesArg(
}
const (
- maxFieldString = 128
+ maxFieldString = 1023
maxFieldMap = 4096
maxFieldNodeId = 256
)
We can then commit and push our patched library with a nice tag:
git commit -a -m 'increase path limit'
git push --set-upstream origin v1.5.0-longer-path
git tag -a v1.5.0-longer-path
git push --tags
With the patch committed and pushed, it’s time to build the CSI driver binary and its container image.
Let’s go back to the csi-powerscale main repo: cd $GOPATH/src/github.com/dell/csi-powerscale
As mentioned in the introduction, we can take advantage of the replace directive in the go.mod file to point to the patched lib. In this case we add the following:
diff --git a/go.mod b/go.mod
index 5c274b4..c4c8556 100644
--- a/go.mod
+++ b/go.mod
@@ -26,6 +26,7 @@ require (
)
replace (
+ github.com/dell/gocsi => github.com/coulof/gocsi v1.5.0-longer-path
k8s.io/api => k8s.io/api v0.20.2
k8s.io/apiextensions-apiserver => k8s.io/apiextensions-apiserver v0.20.2
k8s.io/apimachinery => k8s.io/apimachinery v0.20.2
When that is done, we obtain the new module from the online repo with: go mod download
Note: If you want to test the changes locally only, we can use the replace directive to point to the local directory with:
replace github.com/dell/gocsi => ../../coulof/gocsi
We can then build our new driver binary locally with: make build
After compiling it successfully, we can create the image. The shortest path to do that is to replace the csi-isilon binary from the dellemc/csi-isilon docker image with:
cat << EOF > Dockerfile.patch
FROM dellemc/csi-isilon:v2.1.0
COPY "csi-isilon" .
EOF
docker build -t coulof/csi-isilon:v2.1.0-long-path -f Dockerfile.patch .
Alternatively, you can rebuild an entire docker image using provided Makefile.
By default, the driver uses a Red Hat Universal Base Image minimal. That base image sometimes misses dependencies, so you can use another flavor, such as:
BASEIMAGE=registry.fedoraproject.org/fedora-minimal:latest REGISTRY=docker.io IMAGENAME=coulof/csi-powerscale IMAGETAG=v2.1.0-long-path make podman-build
The image is ready to be pushed in whatever image registry you prefer. In this case, this is hub.docker.com: docker push coulof/csi-isilon:v2.1.0-long-path.
The last step is to replace the driver image used in your Kubernetes with your custom one.
Again, multiple solutions are possible, and the one to choose depends on how you deployed the driver.
If you used the helm installer, you can add the following block at the top of the myvalues.yaml file:
images:
  driver: docker.io/coulof/csi-powerscale:v2.1.0-long-path
Then update or uninstall/reinstall the driver as described in the documentation.
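For instance, with a Helm-based install the update could look like the following; the release name, chart reference, and namespace here are assumptions for illustration and should match whatever you used at install time:
# Upgrade the existing release with the overridden image (example names)
helm upgrade isilon dell/csi-isilon -n powerscale -f myvalues.yaml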
If you decided to use the Dell CSI Operator, you can simply point to the new image:
apiVersion: storage.dell.com/v1
kind: CSIIsilon
metadata:
  name: isilon
spec:
  driver:
    common:
      image: "docker.io/coulof/csi-powerscale:v2.1.0-long-path"
...
Or, if you want to do a quick and dirty test, you can create a patch file (here named path_csi-isilon_controller_image.yaml) with the following content:
spec:
  template:
    spec:
      containers:
        - name: driver
          image: docker.io/coulof/csi-powerscale:v2.1.0-long-path
You can then apply it to your existing install with: kubectl patch deployment -n powerscale isilon-controller --patch-file path_csi-isilon_controller_image.yaml
In all cases, you can check that everything works by first making sure that the Pod is started:
kubectl get pods -n powerscale
and that the logs are clean:
kubectl logs -n powerscale -l app=isilon-controller -c driver.
As demonstrated, thanks to the open source, it’s easy to fix and improve Dell CSI drivers or Dell Container Storage Modules.
Keep in mind that Dell officially supports (through tickets, Service Requests, and so on) the image and binary, but not the custom build.
Thanks for reading and stay tuned for future posts on Dell Storage and Kubernetes!
Author: Florian Coulombel
Mon, 29 Apr 2024 18:11:07 -0000
|Read Time: 0 minutes
The quarterly update for Dell CSI Drivers & Dell Container Storage Modules (CSM) is here! Here’s what we’re planning.
Dell Container Storage Modules (CSM) add data services and features that are not in the scope of the CSI specification today. The new CSM Operator simplifies the deployment of CSMs. With an ever-growing ecosystem and added features, deploying a driver and its affiliated modules need to be carefully studied before beginning the deployment.
The new CSM Operator:
In the short/middle term, the CSM Operator will deprecate the experimental CSM Installer.
For disaster recovery protection, PowerScale implements data replication between appliances by means of the the SyncIQ feature. SyncIQ replicates the data between two sites, where one is read-write while the other is read-only, similar to Dell storage backends with async or sync replication.
The role of the CSM replication module and underlying CSI driver is to provision the volume within Kubernetes clusters and prepare the export configurations, quotas, and so on.
CSM Replication for PowerScale has been designed and implemented in such a way that it won’t collide with your existing Superna Eyeglass DR utility.
A live-action demo will be posted in the coming weeks on our VP YouTube channel: https://www.youtube.com/user/itzikreich/.
In this release, each CSI driver:
Kubernetes v1.19 introduced the fsGroupPolicy to give more control to the CSI driver over the permission sets in the securityContext.
There are three possible options:
In all cases, Dell CSI drivers let kubelet perform the change ownership operations and do not do it at the driver level.
Drivers for PowerFlex and Unity can now be installed with the help of the install scripts we provide under the dell-csi-installer directory.
A standalone Helm chart helps to easily integrate the driver installation with the agent for Continuous Deployment like Flux or Argo CD.
Note: To ensure that you install the driver on a supported Kubernetes version, the Helm charts take advantage of the kubeVersion field. Some Kubernetes distributions use labels in kubectl version (such as v1.21.3-mirantis-1 and v1.20.7-eks-1-20-7) that require manual editing.
Drivers for PowerFlex and Unity implement Volume Health Monitoring.
This feature is currently in alpha in Kubernetes (in Q1-2022), and is disabled with a default installation.
Once enabled, the drivers will expose the standard storage metrics, such as capacity usage and inode usage through the Kubernetes /metrics endpoint. The metrics will flow natively in popular dashboards like the ones built-in OpenShift Monitoring:
All Dell drivers and dependencies like gopowerstore, gobrick, and more are now on Github and will be fully open-sourced. The umbrella project is and remains https://github.com/dell/csm, from which you can open tickets and see the roadmap.
The Dell partnership with Google continues, and the latest CSI drivers for PowerScale and PowerStore support Anthos v1.9.
Both CSI PowerScale and PowerStore now allow setting the default permissions for the newly created volume. To do this, you can use POSIX octal notation or ACL.
For more details you can:
Author: Florian Coulombel
Mon, 29 Apr 2024 17:44:07 -0000
|Read Time: 0 minutes
The quarterly update for Dell CSI Driver is here! But today marks a significant milestone because we are also announcing the availability of Dell EMC Container Storage Modules (CSM). Here’s what we’re covering in this blog:
Dell Container Storage Modules is a set of modules that aims to extend Kubernetes storage features beyond what is available in the CSI specification.
The CSM modules will expose storage enterprise features directly within Kubernetes, so developers are empowered to leverage them for their deployment in a seamless way.
Most of these modules are released as sidecar containers that work with the CSI driver for the Dell storage array technology you use.
CSM modules are open-source and freely available from : https://github.com/dell/csm.
Many stateful apps can run on top of multiple volumes. For example, we can have a transactional DB like Postgres with a volume for its data and another for the redo log, or Cassandra that is distributed across nodes, each having a volume, and so on.
When you want to take a recoverable snapshot, it is vital to take them consistently at the exact same time.
Dell CSI Volume Group Snapshotter solves that problem for you. With the help of a CustomResourceDefinition, an additional sidecar to the Dell CSI drivers, and leveraging vanilla Kubernetes snapshots, you can manage the life cycle of crash-consistent snapshots. This means you can create a group of volumes for which the drivers create snapshots, restore them, or move them with one shot simultaneously!
To take a crash-consistent snapshot, you can either use labels on your PersistantVolumeClaim, or be explicit and pass the list of PVCs that you want to snap. For example:
apiVersion: v1 apiVersion: volumegroup.storage.dell.com/v1alpha2 kind: DellCsiVolumeGroupSnapshot metadata: # Name must be 13 characters or less in length name: "vg-snaprun1" spec: driverName: "csi-vxflexos.dellemc.com" memberReclaimPolicy: "Retain" volumesnapshotclass: "poweflex-snapclass" pvcLabel: "vgs-snap-label" # pvcList: # - "pvcName1" # - "pvcName2"
For the first release, CSI for PowerFlex supports Volume Group Snapshot.
The CSM Observability module is delivered as an open-telemetry agent that collects array-level metrics to scrape them for storage in a Prometheus DB.
The integration is as easy as creating a Prometheus ServiceMonitor for Prometheus. For example:
apiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: name: otel-collector namespace: powerstore spec: endpoints: - path: /metrics port: exporter-https scheme: https tlsConfig: insecureSkipVerify: true selector: matchLabels: app.kubernetes.io/instance: karavi-observability app.kubernetes.io/name: otel-collector
With the observability module, you will gain visibility on the capacity of the volume you manage with Dell CSI drivers and their performance, in terms of bandwidth, IOPS, and response time.
Thanks to pre-canned Grafana dashboards, you will be able to go through these metrics’ history and see the topology between a Kubernetes PersistentVolume (PV) until its translation as a LUN or fileshare in the backend array.
The Kubernetes admin can also collect array level metrics to check the overall capacity performance directly from the familiar Prometheus/Grafana tools.
For the first release, Dell EMC PowerFlex and Dell EMC PowerStore support CSM Observability.
Each Dell storage array supports replication capabilities. It can be asynchronous with an associated recovery point objective, synchronous replication between sites, or even active-active.
Each replication type serves a different purpose related to the use-case or the constraint you have on your data centers.
The Dell CSM replication module allows creating a persistent volume that can be of any of three replication types -- synchronous, asynchronous, and metro -- assuming the underlying storage box supports it.
The Kubernetes architecture can build on a stretched cluster between two sites or on two or more independent clusters. The module itself is composed of three main components:
The usual workflow is to create a PVC that is replicated with a classic Kubernetes directive by just picking the right StorageClass. You can then use repctl or edit the DellCSIReplicationGroup CRD to launch operations like Failover, Failback, Reprotect, Suspend, Synchronize, and so on.
For the first release, Dell EMC PowerMax and Dell EMC PowerStore support CSM Replication.
With CSM Authorization we are giving back more control of storage consumption to the storage administrator.
The authorization module is an independent service, installed and owned by the storage admin.
Within that module, the storage administrator will create access control policies and storage quotas to make sure that Kubernetes consumers are not overconsuming storage or trying to access data that doesn’t belong to them.
CSM Authorization makes multi-tenant architecture real by enforcing Role-Based Access Control on storage objects coming from multiple and independent Kubernetes clusters.
The authorization module acts as a proxy between the CSI driver and the backend array. Access is granted with an access token that can be revoked at any point in time. Quotas can be changed on the fly to limit or increase storage consumption from the different tenants.
For the first release, Dell EMC PowerMax and Dell EMC PowerFlex support CSM Authorization.
When dealing with StatefulApp, if a node goes down, vanilla Kubernetes is pretty conservative.
Indeed, from the Kubernetes control plane, the failing node is seen as not ready. It can be because the node is down, or because of network partitioning between the control plane and the node, or simply because the kubelet is down. In the latter two scenarios, the StatefulApp is still running and possibly writing data on disk. Therefore, Kubernetes won’t take action and lets the admin manually trigger a Pod deletion if desired.
The CSM Resiliency module (sometimes named PodMon) aims to improve that behavior with the help of collected metrics from the array.
Because the driver has access to the storage backend from pretty much all other nodes, we can see the volume status (mapped or not) and its activity (are there IOPS or not). So when a node goes into NotReady state, and we see no IOPS on the volume, Resiliency will relocate the Pod to a new node and clean whatever leftover objects might exist.
The entire process happens in seconds between the moment a node is seen down and the rescheduling of the Pod.
To protect an app with the resiliency module, you only have to add the label podmon.dellemc.com/driver to it, and it is then protected.
For more details on the module’s design, you can check the documentation here.
For the first release, Dell EMC PowerFlex and Dell EMC Unity support CSM Resiliency.
Each module above is released either as an independent helm chart or as an option within the CSI Drivers.
For more complex deployments, which may involve multiple Kubernetes clusters or a mix of modules, it is possible to use the csm installer.
The CSM Installer, built on top of carvel gives the user a single command line to create their CSM-CSI application and to manage them outside the Kubernetes cluster.
For the first release, all drivers and modules support the CSM Installer.
For each driver, this release provides:
VMware Tanzu offers storage management by means of its CNS-CSI driver, but it doesn’t support ReadWriteMany access mode.
If your workload needs concurrent access to the filesystem, you can now rely on CSI Driver for PowerStore, PowerScale and Unity through the NFS protocol. The three platforms are officially supported and qualified on Tanzu.
The NFS driver for PowerStore, PowerScale, and Unity has been tested and works when the Kubernetes cluster is behind a private network.
By default, the CSI driver creates volumes with 777 POSIX permission on the directory.
Now with the isiVolumePathPermissions parameter, you can use ACLs or any more permissive POSIX rights.
The isiVolumePathPermissions parameter can be configured as part of the ConfigMap with the PowerScale settings or at the StorageClass level. The accepted values are private_read, private, public_read, public_read_write, and public for ACLs, or any POSIX mode combination, as in the hedged StorageClass sketch below.
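Here is a minimal sketch of a PowerScale StorageClass using one of the ACL presets. The provisioner name and the surrounding fields are assumptions based on typical csi-powerscale samples; verify them against your driver version.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: isilon-private
provisioner: csi-isilon.dellemc.com
reclaimPolicy: Delete
allowVolumeExpansion: true
parameters:
  # Either an ACL preset (private_read, private, public_read,
  # public_read_write, public) or a POSIX mode such as "0755".
  isiVolumePathPermissions: "private_read"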
For more details you can:
Author: Florian Coulombel
Tue, 12 Dec 2023 18:16:57 -0000
|Read Time: 0 minutes
Kubernetes has become a pivotal technology in managing containerized applications, but it's not without its challenges, particularly when dealing with stateful apps and non-graceful shutdown scenarios. This article delves into the intricacies of handling such situations, drawing insights from Dell Technologies' expertise, and, more importantly, explains how to enable the solution.
Understanding Graceful vs. Non-Graceful Node Shutdowns in Kubernetes
A 'graceful' node shutdown in Kubernetes is an orchestrated process. When kubelet detects a node shutdown event, it terminates the pods on that node properly, releasing resources before the actual shutdown. This orderly process allows critical pods to be terminated after regular pods, ensuring an application continues operating as long as possible. This process is vital for maintaining high availability and resilience in applications.
However, issues arise with a non-graceful shutdown, like a hard stop or node crash. In such cases, kubelet fails to detect a clean shutdown event. This leads to Kubernetes marking the node 'NotReady', and Pods in a StatefulSet can remain stuck in 'Terminating' mode indefinitely!
Kubernetes adopts a cautious approach in these scenarios since it cannot ascertain if the issue is a total node failure, a kubelet problem, or a network glitch. This distinction is critical, especially for stateful apps, where rescheduling amidst active data writing could lead to severe data corruption.
Role of Dell's Container Storage Module (CSM) for Resiliency
Dell's CSM for Resiliency plays a crucial role in automating decision-making in these complex scenarios, aiming to minimize manual intervention and maximize uptime. The module's functionality is highlighted through a typical workflow:
The following tutorial allows you to test the functionality live: https://dell.github.io/csm-docs/docs/interactive-tutorials/
How to enable the module?
To take advantage of CSM Resiliency, you need two things:
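As a reference point, enabling the module at driver installation time typically looks like the following hypothetical values.yaml excerpt; the key names are assumptions based on the PowerFlex Helm chart and should be checked against your chart version. The workload then also has to carry the podmon.dellemc.com/driver label described earlier.
# Hypothetical Helm values.yaml excerpt: enable the resiliency (podmon)
# sidecar when installing the CSI driver. Key names are assumptions.
podmon:
  enabled: true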
Managing non-graceful shutdowns in Kubernetes, particularly for stateful applications, is a complex but essential aspect of ensuring system resilience and data integrity.
Tools like Dell's CSM for Resiliency are instrumental in navigating these challenges, offering automated, intelligent solutions that keep applications running smoothly even in the face of unexpected failures.
Stay informed of the latest updates of the Dell CSM ecosystem by subscribing to:
* The Dell CSM GitHub repository
* Our DevOps & Automation YouTube playlist
* The Dell CSM Slack (https://dell-csm.slack.com)
Mon, 02 Oct 2023 13:21:45 -0000
|Read Time: 0 minutes
When I was a customer, I consistently evaluated how to grow the technical influence of the mainframe platform. If I were talking about the financials of the platform, I would evaluate the total cost of ownership (TCO) alongside various IT solutions and the value deduced thereof. If discussing existing technical pain points, I would evaluate technical solutions that may alleviate the issue.
For example, when challenged with finding a solution for a client organization aiming to refresh various x86 servers, I searched online presentations, YouTube videos, and technical websites for a spark. The client organization had already identified the pain point. The hard part was how.
Over time, I found the ability to run Linux on a mainframe (called Linux on Z), using an Integrated Facility for Linux (IFL) engine. Once the idea was formed, I started baking the cake. I created a proof-of-concept environment installing Linux and a couple of applications and began testing.
The light-bulb moment came not in resolving the original pain point, but in discovering new opportunities I had not originally thought of. More specifically:
With the 2023 addition of Kubernetes on LinuxONE (a mainframe that runs only Linux), you can scale, reduce TCO, and build the hybrid cloud your IT management requires. With Kubernetes providing container orchestration regardless of the underlying hardware and architecture, you can leverage the benefits of LinuxONE to deploy your applications in a structured fashion.
Benefits when deploying Kubernetes to Linux on Z may include:
With Dell providing storage to mainframe environments with PowerMax 8500/2500, a Container Storage Interface (CSI) was created to simplify your experience with allocating storage to Kubernetes environments when using Linux on Z with Kubernetes.
The remaining content will focus on the CSI for PowerMax. Continue reading to explore what’s possible.
Linux on IBM Z runs on the s390x architecture. This means that all the software we use needs to be compiled with that architecture in mind.
Luckily, Kubernetes, the CSI sidecars, and the Dell CSI drivers are built in Golang. Since the early days of Go, portability and support for different operating systems and architectures have been among the goals of the project. You can get the list of OS and architecture pairs compatible with your Go version using the command:
go tool dist list
The easiest and most straightforward way of trying Kubernetes on LinuxONE is by using the k3s distro. It installs with the following one-liner:
curl -sfL https://get.k3s.io | sh -
The Dell CSI Driver for PowerMax is composed of a container to run all actions against Unisphere and mount a LUN to a pod, with a set of official CSI sidecars to interact with Kubernetes calls.
The official Kubernetes sidecars are published for multiple architectures, including s390x, while Dell publishes images only for x86_64.
To build the driver, we will first build the binary and then the image.
First, let's clone the driver from https://github.com/dell/csi-powermax into your GOPATH. To build the driver, go into the directory and execute:
CGO_ENABLED=0 GOOS=linux GOARCH=s390x GO111MODULE=on go build
At the end of the build, you should have a single binary with static libraries, compiled for s390x:
file csi-powermax
csi-powermax: ELF 64-bit MSB executable, IBM S/390, version 1 (SYSV), statically linked, Go BuildID=…, with debug_info, not stripped
The distributed driver uses the minimal Red Hat Universal Base Image (UBI). There is no s390x-compatible UBI image, so we need to rebuild the container image from a Fedora base image.
The following is the Dockerfile:
# Dockerfile to build PowerMax CSI Driver
FROM docker.io/fedora:37
# install dependencies, followed by cleaning the cache
RUN yum install -y \
util-linux \
e2fsprogs \
which \
xfsprogs \
device-mapper-multipath \
&& \
yum clean all \
&& \
rm -rf /var/cache/run
# validate some cli utilities are found
RUN which mkfs.ext4
RUN which mkfs.xfs
COPY "csi-powermax" .
COPY "csi-powermax.sh" .
ENTRYPOINT ["/csi-powermax.sh"]
We can now build our container image with the help of docker buildx, which makes cross-architecture builds a breeze:
docker buildx build -o type=registry -t coulof/csi-powermax:v2.8.0 --platform=linux/s390x -f Dockerfile.s390x .
The last step is to change the image in the Helm chart to point to the new one: https://github.com/dell/helm-charts/blob/main/charts/csi-powermax/values.yaml
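A hypothetical excerpt of that values.yaml is shown below; the exact key layout depends on the chart version, so treat this as a sketch rather than the definitive syntax.
# Hypothetical values.yaml excerpt: point the chart at the locally built
# s390x image instead of the published x86_64 one. Key names are assumptions.
images:
  driver: coulof/csi-powermax:v2.8.0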
Et voilà! Everything else is the same as with a regular CSI driver.
Thanks to the open-source model of Kubernetes and Dell CSM, it’s easy to build and utilize them for many different architectures.
The CSI driver for PowerMax supports FBA devices via Fibre Channel and iSCSI. There is no support for CKD devices, which would require code changes.
The CSI driver for PowerMax supports the standard CSI-compliant calls.
Note: Dell officially supports (through GitHub tickets, Service Requests, and Slack) the published image and binary, but not this custom build.
Stay informed of the latest updates of the Dell CSM ecosystem by subscribing to:
Authors: Justin Bastin & Florian Coulombel
Fri, 22 Sep 2023 21:29:12 -0000
|Read Time: 0 minutes
This is already the third Dell Container Storage Modules (CSM) release of 2023!
The official changelog is available in the CHANGELOG directory of the CSM repository.
The newly supported Kubernetes distributions are:
Historically, PowerMax and PowerFlex have been Dell's high-end and software-defined block storage platforms, respectively. Both of these backends recently introduced support for software-defined NAS.
This means that the respective CSI drivers can now provision PVCs with the ReadWriteMany access mode for the file volume type. In other words, thanks to the NFS protocol, different nodes of the Kubernetes cluster can access the same volume concurrently. This feature is particularly useful for applications, such as log management tools like Splunk or Elasticsearch, that need to process logs coming from multiple Pods.
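For illustration, here is a hedged example of such a claim; the StorageClass name powerflex-nfs is an assumption and stands in for whatever NFS-backed class your driver exposes.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-logs
spec:
  accessModes:
    # Several Pods on different nodes can mount this volume concurrently.
    - ReadWriteMany
  # Hypothetical NFS-backed StorageClass name.
  storageClassName: powerflex-nfs
  resources:
    requests:
      storage: 50Gi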
Like PowerScale in v1.7.0, PowerMax and Dell Unity allow you to check the storage capacity on a node before deploying storage to that node. This isn't that relevant in the case of shared storage, because shared storage generally will always show the same capacity to each node in the cluster. However, it could prove useful if the array lacks available storage.
Using this feature, the CSI driver creates an object of type CSIStorageCapacity in the same namespace as the driver, one per StorageClass.
An example:
kubectl get csistoragecapacities -n unity # This shows one object per storageClass.
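For reference, a hedged illustration of one such object follows; the names and the capacity figure are made up, and the driver (not the user) creates these objects.
apiVersion: storage.k8s.io/v1
kind: CSIStorageCapacity
metadata:
  # Object name is generated by the driver; this one is illustrative only.
  name: csisc-unity-example
  namespace: unity
storageClassName: unity-iscsi
capacity: 2Ti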
The Volume Limits feature is added to both PowerStore and PowerFlex. All Dell storage platforms now implement this feature.
This option limits the maximum number of volumes to which a Kubernetes worker node can connect. This can be configured on a per-node basis, or cluster-wide. Setting this variable to zero disables the limit.
Here are some PowerStore examples.
Per node:
kubectl label node <node name> max-powerstore-volumes-per-node=<volume_limit>
For the entire cluster (all worker nodes):
Specify maxPowerstoreVolumesPerNode or maxVxflexVolumesPerNode in the values.yaml file upon Helm installation.
If you opted for the CSM Operator deployment, you can control the limit by specifying X_CSI_MAX_VOLUMES_PER_NODES in the custom resource.
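As a hedged reference, a cluster-wide limit set at Helm installation time could look like the values.yaml excerpt below, using the parameter named above.
# Hedged values.yaml excerpt for a Helm installation of the PowerStore driver:
# cap every worker node at 20 volumes (0 disables the limit).
maxPowerstoreVolumesPerNode: 20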
Stay informed of the latest updates of the Dell CSM ecosystem by subscribing to:
Author: Florian Coulombel
Fri, 30 Jun 2023 13:42:36 -0000
|Read Time: 0 minutes
The second release of 2023 for Kubernetes CSI Driver & Dell Container Storage Modules (CSM) is here!
The official changelog is available in the CHANGELOG directory of the CSM repository.
As you may know, Dell Container Storage Modules (CSM) bring powerful enterprise storage features and functionality to your Kubernetes workloads running on Dell primary storage arrays, and provide easier adoption of cloud native workloads, improved productivity, and scalable operations. Read on to learn more about what’s in this latest release.
The newly supported Kubernetes distributions are:
For the last couple of versions, the CSI PowerMax reverse proxy has been enabled by default. The TLS certificate secret creation is now pre-packaged using cert-manager, to avoid manual steps for the administrator.
A volume can be mounted to a Pod as `readOnly`. This is the default behavior for a `configMap` or `secret`. That option is now also supported for RawBlock devices.
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
        # Whatever the accessMode is, it will be read-only for the Pod
        readOnly: true
...
CSM v1.5 introduced the capability to provision Fibre Channel LUNs to Kubernetes worker nodes through VMware Raw Device Mapping. One limitation was that the RDM/LUN was pinned to a single ESXi host, meaning that the Pod could not move to another worker node.
The auto-RDM feature works at the HostGroup level in PowerMax and therefore supports clusters with multiple ESXi hosts.
We now expose the Host I/O Limits of the storage group as StorageClass parameters. The Host I/O Limit implements QoS at the worker-node level and prevents noisy-neighbor behavior.
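A hedged StorageClass sketch follows; the parameter names mirror recent csi-powermax samples but are assumptions here, and the SYMID/SRP values are placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: powermax-qos
provisioner: csi-powermax.dellemc.com
parameters:
  SYMID: "000000000001"
  SRP: "SRP_1"
  ServiceLevel: "Bronze"
  # Host I/O Limits applied to the storage group created by the driver
  # (parameter names are assumptions; check your driver samples).
  HostLimitName: "HostLimit1"
  HostIOLimitMBSec: "100"
  HostIOLimitIOSec: "500"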
Storage Capacity Tracking is used by the Kubernetes scheduler to make sure that the node and backend storage have enough capacity for Pod/PVC.
The user can now set quota limit parameters from the PVC and the StorageClass. This gives the user better control of the quota parameters (including the soft limit, advisory limit, and soft grace period) attached to each PVC.
The PVC settings take precedence if quota limit values are specified in both the StorageClass and the PVC.
One can now use the CSM Operator to install Dell Unity and PowerMax CSI drivers and affiliated modules.
The CSM Operator now provides CSM Resiliency and CSM Replication for CSI PowerFlex.
A detailed matrix of supported CSM components is available here.
The CSM Installation Wizard is the easiest and most straightforward way to install the Dell CSI drivers and Container Storage Modules.
In this release, we are adding support for Dell Unity, PowerScale, and PowerFlex.
To keep it simple, we removed the option to install the driver and modules in separate namespaces.
In this release of CSM, Secrets Encryption is enabled by default.
When you use CSM replication, two volumes are created: the active volume and the replica. Prior to CSM v1.7, if you removed the two PVs, the physical replica wasn't deleted.
Now on PV deletion, we cascade the removal to all objects, including the replica block volumes in PowerStore, PowerMax, and PowerFlex, so that there are no more orphan volumes.
Stay informed of the latest updates of the Dell CSM ecosystem by subscribing to:
Author: Florian Coulombel
Mon, 20 Mar 2023 14:03:34 -0000
|Read Time: 0 minutes
HashiCorp's Terraform enables DevOps organizations to provision, configure, and modify infrastructure using human-readable configuration files or plans written in HashiCorp Configuration Language (HCL). The information required to configure various infrastructure components is provided within pre-built Terraform providers, so that the end user can easily discover the infrastructure properties that can be used to effect configuration changes. The configuration files can be versioned, reused, and shared, enabling more consistent workflows for managing infrastructure. These configurations, when executed, change the state of the infrastructure to bring it to the desired state. The idempotency feature of Terraform ensures that only the necessary changes are made to the infrastructure to reach the desired state, even when the same configuration is run multiple times, thereby avoiding unwanted drift of the infrastructure state.
Today we are announcing the availability of the following Terraform providers for the Dell infrastructure portfolio:
Code in Terraform files is organized as distinct code blocks and is declarative in style, declaring the various components of the infrastructure. This is very much in contrast with the sequence of steps executed in a typical imperative-style programming or scripting language. In the simplest of terms, a declarative approach describes the end state or result rather than the step-by-step process. Here are the main elements used as building blocks to define various infrastructure components in a Terraform project:
These elements are organized into different .tf files in a way that is suitable for the project. However, as a norm, Terraform projects are organized with the following files in the project root directory or a module directory:
Following are the details of the resources and data sources that come with the different providers for Dell infrastructure:
| | Resources | Data sources |
|---|---|---|
| PowerFlex | | |
| PowerStore | | |
| PowerMax | | |
| OpenManage Enterprise | | |
We invite you to check out the following videos to get started!
Wed, 15 Mar 2023 14:41:14 -0000
|Read Time: 0 minutes
Some time ago, I faced a bug where it was important to understand the precise workflow.
One of the beauties of open source is that the user can also take the pilot seat!
In this post, we will see how to compile the Dell CSI driver for PowerFlex with a debugger, configure the driver to allow remote debugging, and attach an IDE.
First, it is important to know that Dell and RedHat are partners, and all CSI/CSM containers are certified by RedHat.
This comes with a couple of constraints, one being that all containers use the Red Hat UBI Minimal image as a base image and, to be certified, extra packages must come from a Red Hat official repo.
CSI PowerFlex needs the e4fsprogs package to format file systems in ext4, and that package is missing from the default UBI repo. To install it, you have these options:
Here we’ll use an Oracle Linux mirror, which allows us to access binary-compatible packages without the need for registration or payment of a Satellite subscription.
The Oracle Linux 8 repo is:
[oracle-linux-8-baseos]
name=Oracle Linux 8 - BaseOS
baseurl=http://yum.oracle.com/repo/OracleLinux/OL8/baseos/latest/x86_64
gpgcheck = 0
enabled = 1
And we add it to the final image in the Dockerfile with a COPY directive:
# Stage to build the driver image
FROM $BASEIMAGE@${DIGEST} AS final
# install necessary packages
# alphabetical order for easier maintenance
COPY ol-8-base.repo /etc/yum.repos.d/
RUN microdnf update -y && \
...
There are several debugger options available for Go. You can use the venerable GDB, a native solution like Delve, or an integrated debugger in your favorite IDE.
For our purposes, we prefer to use Delve because it allows us to connect to a remote Kubernetes cluster.
Our Dockerfile employs a multi-staged build approach. The first stage is for building (and is named builder), based on the Golang image; we can add Delve with the following directive:
RUN go install github.com/go-delve/delve/cmd/dlv@latest
And then compile the driver.
On the final image that is our driver, we add the binary as follows:
# copy in the driver
COPY --from=builder /go/src/csi-vxflexos /
COPY --from=builder /go/bin/dlv /
(With older Go versions, the equivalent would be to download Delve in the build stage with RUN go get github.com/go-delve/delve/cmd/dlv and copy the binary into the final image with the same COPY --from=builder /go/bin/dlv / directive.)
To achieve better results with the debugger, it is important to disable optimizations when compiling the code.
This is done in the Makefile with:
CGO_ENABLED=0 GOOS=linux GO111MODULE=on go build -gcflags "all=-N -l"
After rebuilding the image with make docker and pushing it to your registry, you need to expose the Delve port for the driver container. You can do this by adding the following lines to your Helm chart. We need to add the lines to the driver container of the Controller Deployment.
ports:
  - containerPort: 40000
Alternatively, you can use the kubectl edit -n powerflex deployment command to modify the Kubernetes deployment directly.
Assuming that the build has been completed successfully and the driver is deployed on the cluster, we can expose the debugger socket locally by running the following command:
kubectl port-forward -n powerflex pod/csi-powerflex-controller-uid 40000:40000
Next, we can open the project in our favorite IDE and ensure that we are on the same branch that was used to build the driver.
In the following screenshot I used GoLand, but VS Code can do remote debugging too.
We can now connect the IDE to that forwarded socket and run the debugger live:
And here is the result of a breakpoint on CreateVolume call:
The full code is here: https://github.com/dell/csi-powerflex/compare/main...coulof:csi-powerflex:v2.5.0-delve.
If you liked this information and need more deep-dive details on Dell CSI and CSM, feel free to reach out at https://dell-iac.slack.com.
Author: Florian Coulombel
Mon, 13 Mar 2023 19:06:05 -0000
|Read Time: 0 minutes
There are a number of tools for managing your Infrastructure as Code, from basic REST API commands that you can script together in the language of your choice to more sophisticated engines like Ansible, Terraform, Chef, SaltStack, or CloudFormation.
Dell already provides comprehensive support for REST APIs and Ansible collections for Dell storage arrays and is now releasing Terraform providers for server and storage products. (A Terraform provider is a plugin that enables Terraform to interact with the vendor API.) Initially, Dell will publish providers for PowerMax, PowerStore, and PowerFlex storage on the Terraform registry to give users access to published resources to manage these storage arrays.
Terraform is an open-source infrastructure-as-code software tool created by HashiCorp. In Terraform, users define data center infrastructure using a declarative configuration language known as HashiCorp Configuration Language (HCL), which is relatively simple and similar to YAML. Terraform encourages a declarative style in which you write code that describes the desired end state of your configuration, and Terraform figures out how to get to that end state. Terraform is also aware of any state it created in the past, because it tracks the configuration in a state file stored locally or in version control.
A Terraform configuration is a complete document in the Terraform language that tells Terraform how to manage a given collection of infrastructure. A configuration can consist of multiple files and directories. This blog takes you through a basic configuration with the PowerMax provider.
(Note: Sample code is published on the Dell GitHub page where the Terraform provider is hosted. This first PowerMax provider for Terraform concentrates on storage provisioning operations, creating masking views, and managing the storage volumes for your applications. More features will come online with later releases based on customer feedback.)
Before configuring anything, it is important to note that the Terraform provider will communicate with Unisphere for PowerMax using REST. At a minimum you will need a user account with storage administrator privileges for the arrays that you need to manage.
To start working with Terraform you will need to install Terraform. See Terraform guides for official documentation. In my case, the host was Red Hat so I simply ran
yum install terraform
After you have installed Terraform, you need to set up any third-party providers you will work with. These are located on the Terraform registry (think of it as an Appstore).
To install the PowerMax provider, copy and paste the code snippet from the Use Provider link for your Terraform configuration file. For example:
terraform {
  required_providers {
    powermax = {
      source  = "dell/powermax"
      version = "1.0.0-beta"
    }
  }
}

provider "powermax" {
  # Configuration options
}
In my case, I have a flat directory structure with a few files in it. The first file is provider.tf, which contains the text shown above.
When the provider file has the required code for the vendor providers, run:
terraform init
At this point, the system is set up and ready to run Terraform with PowerMax storage.
With Terraform installed and the provider set up, we now need to explore the other files we'll need to manage a configuration with Terraform.
All Terraform configurations store configuration in a state file (usually terraform.tfstate). This file keeps track of configuration information about managed objects and is used for the idempotency features of Terraform. The state file can be local to the Terraform host, but if you have multiple users, or if you are using automation and CI/CD pipelines to run Terraform, the state file needs to be accessible to all of them; in that case, place the state file on shared storage and grant permissions as needed. Here's what my state file looks like, pointing to a shared storage location on an S3 bucket:
Now that we’ve set up a shared state file, I can create my configurations for managing my PowerMax Storage configurations.
There are three stages to creating a configuration with Terraform:
(Image credit https://developer.hashicorp.com/terraform/intro.)
To write configuration files for PowerMax infrastructure, you can use the sample code snippets for each of the resources, available on the Terraform registry or on the Dell GitHub for the Terraform provider. You can copy and customize the code to meet your requirements.
In the following example configuration, the file defines resources for the storage group, volumes, masking view, port group, host, and host group. The configuration also defines some VMware resources to create a datastore from the newly configured PowerMax device.
resource "powermax_storage_group" "tmevcenter_sg" { name = "tmevcenter_sg" srpid = "SRP_1" service_level = "Diamond" } resource "powermax_host_group" "BETA_CLUSTER" { name ="BETA_CLUSTER" host_flags = {} host_ids = ["DELL52", "DELL55"] } resource "powermax_host" "DELL52" { name = "DELL52" initiators = [ "100000109b56a004", "100000109b56a007"] host_flags = {} } resource "powermax_host" "DELL55" { name = "DELL55" initiators = [ "100000109b56a016", "100000109b56a0ca"] host_flags = {} } resource "powermax_port_group" "tmevcenter_pg" { name = "tmevcenter_pg" protocol = "SCSI_FC" ports = [ { director_id = "OR-1C" port_id = "0" }, { director_id = "OR-2C" port_id = "0" }, { director_id = "OR-2C" port_id = "1" }, { director_id = "OR-2C" port_id = "1" } ] } resource "powermax_volume" "volume_1" { name = "vcenter_ds_by_terraform_volume_1" size = 20 cap_unit = "GB" sg_name = "tmevcenter_sg" enable_mobility_id = false } resource "powermax_masking_view" "tmevcenter_mv" { name ="tmevcenter_mv" storage_group_id = powermax_storage_group.tmevcenter_sg.id port_group_id = powermax_port_group.tmevcenter_pg.id host_group_id = powermax_host_group.BETA_CLUSTER.id } data "vsphere_vmfs_disks" "available" { host_system_id = data.vsphere_host.main_esxi_host.id rescan = true filter = "naa" } resource "vsphere_vmfs_datastore" "datastore" { name = "terraform-test" host_system_id = data.vsphere_host.main_esxi_host.id disks = ["naa.${lower(powermax_volume.volume_1.wwn)}"] }
Running the plan command from the configuration directory outputs any changes needed on the PowerMax array and vCenter without executing them. You can compare the plan against your change requests to ensure that it will produce the expected results.
terraform plan
The following output from the terraform plan command shows objects that will be created by applying the plan outlined in the configuration.
After creating the plan, we get a summary of the output. In this case, Terraform will add five objects and create the datastore, storage group, volumes, port group, and masking view.
If you are working with existing objects, you must import them into the Terraform state file before applying and executing your configuration. To do this, run the terraform import command.
For example, to import a host group resource called MY_CLUSTER, specify:
terraform import powermax_host_group.MY_CLUSTER MY_CLUSTER
To view the state of any managed object in your state file, you can check it with the terraform state show command, as shown here:
Executing the plan with the apply command runs the configuration changes:
terraform apply
As I mentioned earlier, this is the first installment of the Terraform provider for PowerMax. As you can see, the main functionality is around the provisioning of storage. In future releases we’ll add more functionality.
To provide any feedback, use the issues section on GitHub. If you are already using Terraform to manage your configuration, the PowerMax provider will no doubt prove useful in assisting your automation journey!
Authors: Paul Martin, Florian Coulombel
Fri, 27 Jan 2023 18:41:52 -0000
|Read Time: 0 minutes
What is REST API?
REST stands for Representational State Transfer, and it is an architectural style for building APIs, or application programming interfaces. JSON stands for JavaScript Object Notation, a lightweight format for storing and transmitting data between a server and a client application. REST and JSON are popular technologies in building web APIs.
The server interface for a REST API is organized as resources that can be accessed through a uniform resource identifier (URI) to perform actions. HTTP methods like GET and PUT perform the CRUD operations (CREATE, READ, UPDATE, and DELETE) on the resources.
REST API calls can be used from almost any modern programming language with the following HTTP methods to communicate with the web server:
In response to the API calls, the web service provides a response in JSON format. JSON is a lightweight, human-readable format for representing structured data. The response includes the status of the call, information requested, and any errors using specific codes. This response is further parsed and processed by the client application.
Here is how a simple GET request looks when used from a shell CLI with the curl command:
The JSON response to a call is provided in a <name>:<value> format:
"servers": [ { "id": 123, "name": "alice" }, { "id": 456, "name": "bob" } ] }
JSON supports nested structures, which allow an object or array to contain other objects and arrays. For example, consider the following JSON data:
{ "name": "John Doe", "age": 35, "address": { "street": "123 Main St", "city": "New York", "state": "NY" }, "phoneNumbers": [ { "type": "home", "number": "212-555-1212" }, { "type": "office", "number": "646-555-1212" } ] }
In a REST API, HTTP status codes indicate the outcome of an API request. Here are some common HTTP status codes that a REST API might return, along with their meanings and suggestions for how a client application could handle them:
Client applications must handle these different HTTP status codes properly to provide a good user experience. For example, if a client receives a 404 Not Found error, it could display a message to the user indicating that the requested resource was not found, rather than just displaying an empty screen.
There are several popular authentication mechanisms for REST APIs, including:
1. Basic authentication: This simple authentication scheme uses a username and password to authenticate a user. The username and password are typically sent in the request header.
curl -X GET 'https://api.example.com/server' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ='
In this example, the Authorization header is set to Basic dXNlcm5hbWU6cGFzc3dvcmQ=, where dXNlcm5hbWU6cGFzc3dvcmQ= is the base64-encoded representation of the string username:password.
2. Token-based authentication: In this scheme, the client exchanges a username and password for a token. The token is then included in subsequent requests to authenticate the user.
curl -X GET 'https://api.example.com/users' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer abc123'
In this example, the Authorization header is set to Bearer abc123, where abc123 is the token that was issued to the client.
3. OAuth: This open-standard authorization framework provides a way for users to authorize access to APIs securely. OAuth involves a client, a resource server, and an authorization server.
4. OpenID Connect: This is a protocol built on top of OAuth 2.0 that provides a way to authenticate users using a third-party service, such as Google or Facebook.
The Dell Technologies infrastructure portfolio has extensive APIs covering all IT infrastructure operations. You can learn more about the API implementation of the different Dell infrastructure products on the Info Hub:
You can also explore Dell infrastructure APIs by visiting the API documentation portal: https://developer.dell.com/apis.
Authors: Florian Coulombel and Parasar Kodati
Thu, 26 Jan 2023 19:04:30 -0000
|Read Time: 0 minutes
One of the first things I do after deploying a Kubernetes cluster is to install a CSI driver to provide persistent storage to my workloads; coupled with a GitOps workflow, it takes only seconds to be able to run stateful workloads.
The GitOps process is nothing more than a few principles:
Nonetheless, to ensure that the process runs smoothly, you must make certain that the application you will manage with GitOps complies with these principles.
This article describes how to use the Microsoft Azure Arc GitOps solution to deploy the Dell CSI driver for Dell PowerMax and affiliated Container Storage Modules (CSMs).
The platform we will use to implement the GitOps workflow is Azure Arc with GitHub. Still, other solutions are possible using Kubernetes agents such as Argo CD, Flux CD, and GitLab.
Azure GitOps itself is built on top of Flux CD.
The first step is to onboard your existing Kubernetes cluster within the Azure portal.
Obviously, the Azure agent must connect to the Internet. In my case, the installation of the Arc agent failed from the Dell network with the error described here: https://docs.microsoft.com/en-us/answers/questions/734383/connect-openshift-cluster-to-azure-arc-secret-34ku.html
Certain URLs (even when bypassing the corporate proxy) don't play well when communicating with Azure. I have seen some services get a self-signed certificate, causing the issue.
The solution for me was to put an intermediate transparent proxy between the Kubernetes cluster and the corporate proxy. That way, we have better control over the responses returned by the proxy.
In this example, we install Squid on a dedicated box with the help of Docker. To make it work, I used the Squid image by Ubuntu and made sure that Kubernetes requests were direct with the help of always_direct:
docker run -d --name squid-container ubuntu/squid:5.2-22.04_beta
docker cp squid-container:/etc/squid/squid.conf ./
egrep -v '^#' squid.conf > my_squid.conf
docker rm -f squid-container
Then add the following section:
acl k8s port 6443   # k8s https
always_direct allow k8s
You can now install the agent per the following instructions: https://docs.microsoft.com/en-us/azure/azure-arc/kubernetes/quickstart-connect-cluster?tabs=azure-cli#connect-using-an-outbound-proxy-server.
export HTTP_PROXY=http://mysquid-proxy.dell.com:3128
export HTTPS_PROXY=http://mysquid-proxy.dell.com:3128
export NO_PROXY=https://kubernetes.local:6443

az connectedk8s connect --name AzureArcCorkDevCluster \
  --resource-group AzureArcTestFlorian \
  --proxy-https http://mysquid-proxy.dell.com:3128 \
  --proxy-http http://mysquid-proxy.dell.com:3128 \
  --proxy-skip-range 10.0.0.0/8,kubernetes.default.svc,.svc.cluster.local,.svc \
  --proxy-cert /etc/ssl/certs/ca-bundle.crt
If everything worked well, you should see the cluster with detailed info from the Azure portal:
To benefit from all the features that Azure Arc offers, give the agent the privileges to access the cluster.
The first step is to create a service account:
kubectl create serviceaccount azure-user
kubectl create clusterrolebinding demo-user-binding --clusterrole cluster-admin --serviceaccount default:azure-user
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: azure-user-secret
  annotations:
    kubernetes.io/service-account.name: azure-user
type: kubernetes.io/service-account-token
EOF
Then, from the Azure UI, when you are prompted to give a token, you can obtain it as follows:
kubectl get secret azure-user-secret -o jsonpath='{$.data.token}' | base64 -d | sed $'s/$/\\\n/g'
Then paste the token in the Azure UI.
The GitOps agent installation can be done with a CLI or in the Azure portal.
As of now, the Microsoft documentation presents in detail the deployment that uses the CLI; so let's see how it works with the Azure portal:
The Git repository organization is a crucial part of the GitOps architecture. It hugely depends on how internal teams are organized, the level of information you want to expose and share, the location of the different clusters, and so on.
In our case, the requirement is to connect multiple Kubernetes clusters owned by different teams to a couple of PowerMax systems using only the latest and greatest CSI driver and affiliated CSM for PowerMax.
Therefore, the monorepo approach is suited.
The organization follows this structure:
.
├── apps
│ ├── base
│ └── overlays
│ ├── cork-development
│ │ ├── dev-ns
│ │ └── prod-ns
│ └── cork-production
│ └── prod-ns
├── clusters
│ ├── cork-development
│ └── cork-production
└── infrastructure
├── cert-manager
├── csm-replication
├── external-snapshotter
└── powermax
You can see all files in https://github.com/coulof/fluxcd-csm-powermax.
Note: The GitOps agent comes with multi-tenancy support; therefore, we cannot cross-reference objects between namespaces. The Kustomization and HelmRelease must be created in the same namespace as the agent (here, flux-system) and have a corresponding targetNamespace to the resource to be installed.
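To illustrate the note above, here is a hedged sketch of a HelmRelease that lives in the agent namespace while installing into its own target namespace; the chart and repository names are assumptions.
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: csi-powermax
  # Created in the same namespace as the GitOps agent...
  namespace: flux-system
spec:
  interval: 10m
  # ...but installed into the namespace where the driver should run.
  targetNamespace: powermax
  chart:
    spec:
      chart: csi-powermax
      sourceRef:
        kind: HelmRepository
        name: dell-helm-charts
        namespace: flux-system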
This article is the first of a series exploring the GitOps workflow. Next, we will see how to manage application and persistent storage with the GitOps workflow, how to upgrade the modules, and so on.
Thu, 12 Jan 2023 19:27:23 -0000
|Read Time: 0 minutes
Made available on December 20th, 2022, the 1.5 release of our flagship cloud-native storage management products, Dell CSI Drivers and Dell Container Storage Modules (CSM), is here!
See the official changelog in the CHANGELOG directory of the CSM repository.
First, this release extends support for Red Hat OpenShift 4.11 and Kubernetes 1.25 to every CSI Driver and Container Storage Module.
Featured in the previous CSM release (1.4), avid customers may recall a few new additions to the portfolio made available in tech preview. Primarily:
Building on these three new modules, Dell Technologies is adding deeper capabilities and major improvements as part of today’s 1.5 release for CSM, including:
For the platform updates included in today’s 1.5 release, the major new features are:
This feature is named “Auto RDM over FC” in the CSI/CSM documentation.
The concept is that the CSI driver connects to both the Unisphere and vSphere APIs to create the respective objects.
When deployed with "Auto-RDM", the driver can only function in that mode. It is not possible to combine iSCSI and FC access within the same driver installation.
The same limitation applies for RDM usage. You can learn more about it at RDM Considerations and Limitations on the VMware website.
That’s all for CSM 1.5! Feel free to share feedback or send questions to the Dell team on Slack: https://dell-csm.slack.com.
Author: Florian Coulombel
Fri, 23 Dec 2022 21:50:39 -0000
|Read Time: 0 minutes
Velero is one of the most popular tools for backup and restore of Kubernetes resources.
You can use Velero for different backup options to protect your Kubernetes cluster. The three modes are:
In all cases, Velero syncs the information (YAML and restic data) to a storage object.
PowerScale is Dell Technologies’ leading scale-out NAS solution. It supports many different access protocols including NFS, SMB, HTTP, FTP, HDFS, and, in the case that interests us, S3!
Note: PowerScale is not 100% compatible with the AWS S3 protocol (for details, see the PowerScale OneFS S3 API Guide).
For a simple backup solution of a few terabytes of Kubernetes data, PowerScale and Velero are a perfect duo.
To deploy this solution, you need to configure PowerScale and then install and configure Velero.
Prepare PowerScale to be a target for the backup as follows:
1. Verify that the S3 service is enabled.
You can check that in the UI under Protocols > Object Storage (S3) > Global Settings, or in the CLI.
In the UI:
In the CLI:
PS1-1% isi s3 settings global view
         HTTP Port: 9020
        HTTPS Port: 9021
        HTTPS only: No
S3 Service Enabled: Yes
2. Create a bucket with the permission to write objects (at a minimum).
That action can also be done from the UI or CLI.
In the UI:
In the CLI:
See isi S3 buckets create in the PowerScale OneFS CLI Command Reference.
3. Create a key for the user that will be used to upload the objects.
Important notes:
Now that PowerScale is ready, we can proceed with the Velero deployment.
We assume that the Velero binary is installed and has access to the Kubernetes cluster. If not, see the Velero installation document for the deployment instructions.
Configure Velero:
$ cat ~/credentials-velero
[default]
aws_access_key_id = 1_admin_accid
aws_secret_access_key = 0**************************i
…
$ velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.5.1 \
    --bucket velero-backup \
    --secret-file ./credentials-velero \
    --use-volume-snapshots=false \
    --cacert ./ps2-cacert.pem \
    --backup-location-config region=powerscale,s3ForcePathStyle="true",s3Url=https://192.168.1.21:9021
…
The preceding command shows how to use Velero in the simplest and most secure way.
It is possible to add parameters to enable protection with snapshots. Every Dell CSI driver has snapshot support. To take advantage of that support, we use the install command with this addition:
velero install \
    --features=EnableCSI \
    --plugins=velero/velero-plugin-for-aws:v1.5.1,velero/velero-plugin-for-csi:v0.3.0 \
    --use-volume-snapshots=true \
    ...
Now that CSI snaps are enabled, we can enable restic to move data out of those snapshots into our backup target by adding:
--use-restic
As you can see, we are using the velero/velero-plugin-for-aws:v1.5.1 image, which is the latest available at the time of the publication of this article. You can obtain the current version from GitHub: https://github.com/vmware-tanzu/velero-plugin-for-aws
After the Velero installation is done, check that everything is correct:
kubectl logs -n velero deployment/velero
If you have an error with the certificates, you should see it quickly.
You can now back up and restore your Kubernetes resources with the usual Velero commands. For example, to protect the entire Kubernetes cluster except kube-system, including the data with PV snapshots:
velero backup create backup-all --exclude-namespaces kube-system
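The same backup can be expressed declaratively, which fits nicely in a GitOps repository. The following is a hedged sketch; the field names assume the velero.io/v1 Backup API.
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: backup-all
  # Velero watches Backup objects in its own namespace.
  namespace: velero
spec:
  excludedNamespaces:
    - kube-system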
You can check the actual content directly from PowerScale file system explorer:
Here is a demo:
Conclusion
For easy protection of small Kubernetes clusters, Velero combined with PowerScale S3 is a great solution. If you are looking for broader features (for a greater amount of data or more destinations that go beyond Kubernetes), look to Dell PowerProtect Data Manager, a next-generation, comprehensive data protection solution.
Interestingly, Dell PowerProtect Data Manager uses the Velero plug-in to protect Kubernetes resources!