CloudLink 7.1: Simplifying datacenter security
Fri, 23 Apr 2021 12:10:59 -0000|
Read Time: 0 minutes
Are you feeling safe about the security of your data center’s infrastructure? Chances are, you aren’t. According to a recent poll1, 74% of customers report experiencing some form of cyber attack in the last twelve months, and 86% were concerned about potential cyberattacks. Clearly, data center security is a topic than can no longer be ignored - and most of our customers are taking steps to ensure their data is safe. Yet even though it’s necessary, adopting data center security can be confusing, complex, and difficult to implement.
Dell EMC CloudLink aides our customers in this effort by being reliable, flexible, and easy to use. Our 7.1 release adds new tools to our toolbox including shallow rekey for our Container based encryption, support for vVols encryption and IPv6 only environments, and the new Secure Configuration Summary page designed to make security audits of CloudLink a breeze.
Every security related framework published discusses the need for regular monitoring and assessment of implemented security controls to ensure that the products and deployment are meeting relevant industry standards. Such activities usually include the dreaded yearly security audit. Datacenter administrators view this effort with disfavor because it takes time out of their already busy schedule to walk through the deployment with the auditor to prove compliance.
In the past we’ve heard from our customers that the CloudLink GUI is easy enough to navigate that security audit reviews weren’t too painful, but they occasionally expressed that it would be nice to make them a little bit easier. Well we heard their requests loud and clear and have obliged with the Secure Configuration Summary. We’ve gathered the information commonly requested during security audits onto one page so when the security administrator and auditor go to CloudLink for a review, it’s a one stop shop.
With audits though, simply viewing configuration settings isn’t enough as most auditors require tangible proof to attach to their reports. Screen shots work but we offer something better – the ability to export the configuration settings provided on the summary page. As with most of our GUI pages, you can export the Secure Configuration Summary to a handy-dandy spreadsheet which can be presented directly to the auditor. A one click audit review – can it get any easier than that?
Of course, not all audits are the same and some requirements are more extensive than others. To accommodate this eventuality, our summary page provides direct links to the configuration pages for each setting. If an auditor needs more information on a particular configuration, simply jump to the relevant page, review, and download an export if needed.
Encryption is hard and it can be a challenge to understand, implement, and maintain. We understand that most of our customers are not in the datacenter security business. CloudLink strives to make data encryption in the datacenter a simple, set it and forget it task, so that our customers can focus on their core business, not on trying to figure out how to keep their data safe – that’s our job.
If you would like to know more about CloudLink and our latest release please visit our website and reach out to your Dell Technologies sales team to ask how we can make data encryption easy for you too.
1 Source: statista.com
Related Blog Posts
PowerFlex and Amazon: Destination EKS Anywhere
Thu, 13 Jan 2022 23:01:50 -0000|
Read Time: 0 minutes
Welcome to your destination. Today Dell Technologies is pleased to share that Amazon Elastic Kubernetes Service (Amazon EKS) Anywhere has been validated on Dell PowerFlex software-defined infrastructure. Amazon EKS Anywhere is a new deployment option for Amazon EKS that enables customers to easily create and operate Kubernetes clusters on-premises while allowing for easy connectivity and portability to Amazon AWS environments. PowerFlex helps customers deliver a flexible deployment solution that scales as needs change with smooth, painless node-by-node expandability, inclusive of compute and storage, in a unified fabric architecture.
Dell Technologies collaborates with a broad ecosystem of public cloud providers to help our customers support multi-cloud environments that help place the right data and applications where it makes the most sense for them. Deploying Amazon EKS Anywhere on Dell Technologies infrastructure streamlines application development and delivery by allowing organizations to easily create and manage on premises Kubernetes clusters.
Across nearly all industries, IT organizations are moving to a more developer-oriented model that requires automated processes, rapid resource delivery, and reliable infrastructure. To drive operational simplicity through Kubernetes orchestration, Amazon EKS Anywhere helps customers automate cluster management, reduce support costs, and eliminate the redundant effort of using multiple open source or 3rd party tools to manage Kubernetes clusters. The combination of automated Kubernetes cluster management with intelligent, automated infrastructure quickly brings organizations to the next stop in their IT Journey, allowing them to provide infrastructure as code and empower their DevOps teams to be the innovation engine for their businesses.
Let us explore Amazon EKS Anywhere on PowerFlex and how it helps you move towards a more developer-oriented model. First, let’s look at the requirements for Amazon EKS Anywhere.
To deploy Amazon EKS Anywhere we will need a PowerFlex environment running VMware vSphere 7.0 or higher. Specifically, our validation used vSphere 7.0.2. We will also need to ensure we have sufficient capacity to deploy 8 to 10 Amazon EKS VMs. Additionally, we will need a network in the vSphere workload cluster with a DHCP service. This network is what the workload VMs will connect to. There are also a few Internet locations that the Amazon EKS administrative VM will need to reach, so that the manifests, OVAs, and Amazon EKS distro can be downloaded. Initial deployments can start with as few as four PowerFlex nodes and grow to meet the expansion needs of storage, compute, or both for scalability of over 1,000 nodes.
The logical view of the Amazon EKS Anywhere environment on PowerFlex is illustrated below.
There are two types of templates used for the workloads: a Bottlerocket template and an Ubuntu image. The Bottlerocket template is a customized image from Amazon that is specific to Amazon EKS Anywhere. The Ubuntu template was used for our validation.
Note: Bottlerocket is a Linux-based open-source operating system that is purpose-built by Amazon. It focuses on security and maintainability, and provides a reliable, consistent, and safe platform for container-based workloads. Amazon EKS managed node groups with Bottlerocket support enable you to leverage the simplicity of managed node provisioning and lifecycle management features, while using the latest best practices for running containers in production. You can run your Kubernetes workloads on Bottlerocket nodes and benefit from enhanced security, higher cluster utilization, and less operational overhead. https://aws.amazon.com/blogs/containers/amazon-eks-adds-native-support-for-bottlerocket-in-managed-node-groups/
After the Amazon EKS admin VM is deployed, a command is issued on the Amazon EKS admin VM. This deploys the workload clusters and creates associated CRD instances on the workload cluster. This illustrates the ease of container deployment with Amazon EKS Anywhere. A single instance was prepped, then with some built-in scripting and commands, the system can direct the complex deployment. This greatly simplifies the process when compared to a traditional Kubernetes deployment.
At this point, the deployment can be tested. Amazon provides a test workload that can be used to validate the environment. You can find the details on testing on the Amazon EKS Anywhere documentation site.
The design that was validated was more versatile than a typical Amazon EKS Anywhere deployment. Instead of using the standard VMware CNS-CSI storage provider, this PowerFlex validation uses the Dell PowerFlex CSI plugin. This makes it possible to take direct advantage of PowerFlex’s storage capabilities. With the CSI plugin, it is possible to extend volumes through Amazon EKS, as well as snapshot and restore volumes.
This allows IT departments to move toward developer-oriented processes. Developers can work with storage natively. There are no additional tools to learn and no need to perform operations outside the development environment. This can be a time savings benefit to developer-oriented IT departments.
Beyond storage control in Amazon EKS Anywhere, the results of these operations can be viewed in the PowerFlex management interface. This provides an end-to-end view of the environment and allows traditional IT administrators to use familiar tools to manage and monitor their environment. This makes it easy for the entire IT organization’s journey to move towards a more developer centric environment.
By leveraging Amazon EKS Anywhere on PowerFlex, organizations get on-premises Kubernetes operational tooling that’s consistent with Amazon EKS. Organizations are able to leverage the Amazon EKS console to view all of their Kubernetes clusters (including Amazon EKS Anywhere clusters) running anywhere, through the Amazon EKS Connector. This brings together both the data center and cloud, simplifying the management of both.
In this journey, we have seen that Amazon EKS Anywhere has been validated on Dell PowerFlex, shown how they work together, and enable expanded storage capabilities for developers inside of Amazon EKS Anywhere. It also allows you to use familiar tools in managing the environment. To find out more about Amazon EKS anywhere on PowerFlex, talk with your Dell representative.
Author: Tony Foster, Sr. Technical Marketing Engineer
Deploying Microsoft SQL Server Big Data Clusters on Kubernetes platform using PowerFlex
Wed, 15 Dec 2021 12:20:15 -0000|
Read Time: 0 minutes
Microsoft SQL Server 2019 introduced a groundbreaking data platform with SQL Server 2019 Big Data Clusters (BDC). Microsoft SQL Server Big Data Clusters are designed to solve the big data challenge faced by most organizations today. You can use SQL Server BDC to organize and analyze large volumes of data, you can also combine high value relational data with big data. This blog post describes the deployment of SQL Server BDC on a Kubernetes platform using Dell EMC PowerFlex software-defined storage.
Dell EMC PowerFlex (previously VxFlex OS) is the software foundation of PowerFlex software-defined storage. It is a unified compute storage and networking solution delivering scale-out block storage service that is designed to deliver flexibility, elasticity, and simplicity with predictable high performance and resiliency at scale.
The PowerFlex platform is available in multiple consumption options to help customers meet their project and data center requirements. PowerFlex appliance and PowerFlex rack provide customers comprehensive IT Operations Management (ITOM) and life cycle management (LCM) of the entire infrastructure stack in addition to sophisticated high-performance, scalable, resilient storage services. PowerFlex appliance and PowerFlex rack are the preferred and proactively marketed consumption options. PowerFlex is also available on VxFlex Ready Nodes for those customers who are interested in software-defined compliant hardware without the ITOM and LCM capabilities.
PowerFlex software-define storage with unified compute and networking offers flexibility of deployment architecture to help best meet the specific deployment and architectural requirements. PowerFlex can be deployed in a two-layer for asymmetrical scaling of compute and storage for “right-sizing capacities, single-layer (HCI), or in mixed architecture.
Microsoft SQL Server Big Data Clusters Overview
Microsoft SQL Server Big Data Clusters are designed to address big data challenges in a unique way, BDC solves many traditional challenges through building big-data and data-lake environments. You can query external data sources, store big data in HDFS managed by SQL Server, or query data from multiple external data sources using the cluster.
SQL Server Big Data Clusters is an additional feature of Microsoft SQL Server 2019. You can query external data sources, store big data in HDFS managed by SQL Server, or query data from multiple external data sources using the cluster.
For more information, see the Microsoft page SQL Server Big Data Clusters partners.
You can use SQL Server Big Data Clusters to deploy scalable clusters of SQL Server and Apache SparkTM and Hadoop Distributed File System (HDFS), as containers running on Kubernetes.
For an overview of Microsoft SQL Server 2019 Big Data Clusters, see Microsoft’s Introducing SQL Server Big Data Clusters and on GitHub, see Workshop: SQL Server Big Data Clusters - Architecture.
Deploying Kubernetes Platform on PowerFlex
For this test, PowerFlex 3.6.0 is built in a two-layer configuration with six Compute Only (CO) nodes and eight Storage Only (SO) nodes. We used PowerFlex Manager to automatically provision the PowerFlex cluster with CO nodes on VMware vSphere 7.0 U2, and SO nodes with Red Hat Enterprise Linux 8.2.
The following figure shows the logical architecture of SQL Server BDC on Kubernetes platform with PowerFlex.
Figure 1: Logical architecture of SQL BDC on PowerFlex
From the storage perspective, we created a single protection domain from eight PowerFlex nodes for SQL BDC. Then we created a single storage pool using all the SSDs installed in each node that is a member of the protection domain.
After we deployed the PowerFlex cluster, we created eleven virtual machines on the six identical CO nodes with Ubuntu 20.04 on them, as shown in the following table.
Table 1: Virtual machines for CO nodes
2 x Intel Gold 6242 R, 20 cores
2 x Intel Gold 6242R, 20 cores
2 x Intel Gold 6242R, 20 cores
2 x Intel Gold 6242R, 20 cores
2 x Intel Gold 6242R, 20 cores
2 x Intel Gold 6242R, 20 cores
We manually installed the SDC component of PowerFlex on the worker nodes of Kubernetes. We then configured a Kubernetes cluster (v 1.20) on the virtual machines with three master nodes and eight worker nodes:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8m1 Ready control-plane,master 10d v1.20.10
k8m2 Ready control-plane,master 10d v1.20.10
k8m3 Ready control-plane,master 10d v1.20.10
k8w1 Ready <none> 10d v1.20.10
k8w2 Ready <none> 10d v1.20.10
k8w3 Ready <none> 10d v1.20.10
k8w4 Ready <none> 10d v1.20.10
k8w5 Ready <none> 10d v1.20.10
k8w6 Ready <none> 10d v1.20.10
Dell EMC storage solutions provide CSI plugins that allow customers to deliver persistent storage for container-based applications at scale. The combination of the Kubernetes orchestration system and the Dell EMC PowerFlex CSI plugin enables easy provisioning of containers and persistent storage.
In the solution, after we installed the Kubernetes cluster, CSI 2.0 was provisioned to enable persistent volumes for SQL BDC workload.
For more information about PowerFlex CSI supported features see Dell CSI Driver Documentation.
For more information about PowerFlex CSI installation using Helm charts, see PowerFlex CSI Documentation.
Deploying Microsoft SQL Server BDC on Kubernetes Platform
When the Kubernetes cluster with CSI is ready, Azure data CLI is installed on the client machine. To create base configuration files for deployment, see deploying Big Data Clusters on Kubernetes . For this solution, we used kubeadm-dev-test as the source for the configuration template.
Initially, using kubectl, each node is labelled to ensure that the pods start on the correct node:
$ kubectl label node k8w1 mssql-cluster=bdc mssql-resource=bdc-master --overwrite=true
$ kubectl label node k8w2 mssql-cluster=bdc mssql-resource=bdc-compute-pool --overwrite=true
$ kubectl label node k8w3 mssql-cluster=bdc mssql-resource=bdc-compute-pool --overwrite=true
$ kubectl label node k8w4 mssql-cluster=bdc mssql-resource=bdc-compute-pool --overwrite=true
$ kubectl label node k8w5 mssql-cluster=bdc mssql-resource=bdc-compute-pool --overwrite=true
$ kubectl label node k8w6 mssql-cluster=bdc mssql-resource=bdc-compute-pool --overwrite=true
To accelerate the deployment of BDC, we recommend that you use an offline installation method from a local private registry. While this means some extra work in creating and configuring a registry, it eliminates the network load of every BDC host pulling container images from the Microsoft repository. Instead, they are pulled once. On the host that acts as a private registry, install Docker and enable the Docker repository.
The BDC configuration is modified from the default settings to use cluster resources and address the workload requirements. For complete instructions about modifying these settings, see Customize deployments section in the Microsoft BDC website. To scale out the BDC resource pools, the number of replicas are adjusted to use the resources of the cluster.
The values shown in the following table are adjusted in the bdc.json file.
Table 2: Cluster resources
Apache Knox Gateway
Spark service resource configuration
Keeps track of nodes within the cluster
The configuration values for running Spark and Apache Hadoop YARN are also adjusted to the compute resources available per node. In this configuration, sizing is based on 768 GB of RAM and 72 virtual CPU cores available per PowerFlex CO node. Most of this configuration is estimated and adjusted based on the workload. In this scenario, we assumed that the worker nodes were dedicated to running Spark workloads. If the worker nodes are performing other operations or other workloads, you may need to adjust these values. You can also override Spark values as job parameters.
For further guidance about configuration settings for Apache Spark and Apache Hadoop in Big Data Clusters, see Configure Apache Spark & Apache Hadoop in the SQL Server BDC documentation section.
The following table highlights the spark settings that are used on the SQL Server BDC cluster.
Table 3: Spark settings
The SQL Server BDC 2019 CU12 release notes state that Kubernetes API 1.20 is supported. Therefore, for this test, the image that was deployed on the SQL master pod was 2019-CU12-ubuntu-16.04. A storage of 20 TB was provisioned for SQL master pod, with 10 TB as log space:
Because the test involved running the TPC-DS workload, we provisioned a total of 60 TB of space for five storage pods:
Validating SQL Server BDC on PowerFlex
To validate the configuration of the Big Data Cluster that is running on PowerFlex and to test its scalability, we ran the TPC-DS workload on the cluster using the Databricks® TPC-DS Spark SQL kit. The toolkit allows you to submit an entire TPC-DS workload as a Spark job that generates the test dataset and runs a series of analytics queries across it. Because this workload runs entirely inside the storage pool of the SQL Server Big Data Cluster, the environment was scaled to run the recommended maximum of five storage pods.
We assigned one storage pod to each worker node in the Kubernetes environment as shown in the following figure.
Figure 2: Pod placement across worker nodes
In this solution, Spark SQL TPC-DS workload is adopted to simulate a database environment that models several applicable aspects of a decision support system, including queries and data maintenance. Characterized by high CPU and I/O load, a decision support workload places a load on the SQL Server BDC cluster configuration to extract maximum operational efficiencies in areas of CPU, memory, and I/O utilization. The standard result is measured by the query response time and the query throughput.
A Spark JAR file is uploaded into a specified directory in HDFS, for example, /tpcds. The spark-submit is done by CURL, which uses the Livy server that is part of Microsoft SQL Server Big Data Cluster.
Using the Databricks TPC-DS Spark SQL kit, the workload is run as Spark jobs for the 1 TB, 5 TB, 10 TB, and 30 TB workloads. For each workload, only the size of the dataset is changed.
The parameters used for each job are specified in the following table.
Table 4: Job parameters
We set the TPC-DS dataset with the different scale factors in the CURL command. The data was populated directly into the HDFS storage pool of the SQL Server Big Data Cluster.
The following figure shows the time that is consumed for data generation of different scale factor settings. The data generation time also includes the post data analysis process that calculates the table statistics.
Figure 3: TPC-DS data generation
After the load we ran the TPC-DS workload to validate the Spark SQL performance and scalability with 99 predefined user queries. The queries are characterized with different user patterns.
The following figure shows the performance and scalability test results. The results demonstrate that running Microsoft SQL Server Big Data Cluster on PowerFlex has linear scalability for different datasets. This shows the ability of PowerFlex to provide a consistent and predictable performance for different types of Spark SQL workloads.
Figure 4: TPC-DS test results
A Grafana dashboard instance that is captured during the 30 TB run of TPC-DS test is shown in the following figure. The figure shows that the read bandwidth of 15 GB/s is achieved during the tests.
Figure 5: Grafana dashboard
In this minimal lab hardware, there were no storage bottlenecks for the TPC-DS data load and query execution. The CPU on the worker nodes reached close to 90 percent indicating that more powerful nodes could enhance the performance.
Running SQL Server Big Data Clusters on PowerFlex is a straightforward way to get started with modernized big data workloads running on Kubernetes. This solution allows you to run modern containerized workloads using the existing IT infrastructure and processes. Big Data Clusters allows Big Data scientists to innovate and build with the agility of Kubernetes, while IT administrators manage the secure workloads in their familiar vSphere environment.
In this solution, Microsoft SQL Server Big Data Clusters are deployed on PowerFlex which provides the simplified operation of servicing cloud native workloads and can scale without compromise. IT administrators can implement policies for namespaces and manage access and quota allocation for application focused management. Application-focused management helps you build a developer-ready infrastructure with enterprise-grade Kubernetes, which provides advanced governance, reliability, and security.
Microsoft SQL Server Big Data Clusters are also used with Spark SQL TPC-DS workloads with the optimized parameters. The test results show that Microsoft SQL Server Big Data Clusters deployed in a PowerFlex environment can provide a strong analytics platform for Big Data solutions in addition to data warehousing type operations.
If you want to discover more, contact your Dell representative.