The latest news about PowerFlex releases and updates
PowerFlex Summer 2021 Updates Deliver on Execution, Compliance, and Confidence
Wed, 23 Jun 2021 15:57:13 -0000
The summer 2021 release of Dell EMC PowerFlex Software-defined Infrastructure extends the PowerFlex family’s transformational superpowers, providing businesses with the agility to thrive in ever-changing economic and technological landscapes. The release of PowerFlex 3.6 and PowerFlex Manager 3.7 enables customers to supercharge their mission-critical workloads with enhanced automation and platform options. It safeguards workload execution with expanded continuity and compliance offerings. And businesses running PowerFlex can be confident in predictable outcomes at scale with new infrastructure insights, network resiliency enhancements, and integrated upgrade guidance.
Keep an eye on the important stuff
A highlight of this release is PowerFlex integration with Dell EMC CloudIQ, a no-cost, cloud-based application that intelligently and proactively monitors the health of Dell EMC storage systems. Users can enjoy multi-system monitoring of system health, configuration/inventory, capacity usage, and historical performance from anywhere. The customer’s PowerFlex system must first be connected to Dell EMC Secure Remote Services (SRS), after which CloudIQ is automatically enabled. A health score is created, based on PowerFlex alerts, and the history of generated and cleared alerts is maintained for two years. After ingesting a couple of weeks of data, CloudIQ machine learning will begin looking for and noting IOPS and bandwidth anomalies. It also watches for and signals latency performance impacts.
Be safe with your data out there
PowerFlex native asynchronous replication was introduced last year with version 3.5. Now, in PowerFlex 3.6, we have made it even more flexible and improved compliance targets. We cut the minimum RPO in half and now support RPOs as low as 15 seconds. We also added tooling to improve control over Replication Consistency Groups (RCGs) – sets of PowerFlex volumes being replicated together. RCGs can now be active or inactive, where inactive RCGs hold their configuration but use no additional system resources. The ability to terminate an RCG and leave it in an inactive state also improves the recovery process if a user runs out of journal capacity.
With this release, PowerFlex supports replication in VMware HCI environments. In this scenario, PowerFlex Manager 3.7 (and above) orchestrates resizing the Storage Virtual Machines (SVMs) and the addition of the Storage Data Replicators (SDRs) into the system. Because the orchestration is done by PowerFlex Manager, the option to replicate between PowerFlex HCI deployments running VMware is limited to appliance and rack deployments.
Systems running 3.5.x can be active replication peers with systems running 3.6, and the source and destination systems can be on different code versions long term. For further information about PowerFlex replication architecture, limitations and design considerations, see the Dell EMC PowerFlex: Introduction to Replication white paper.
Along with these internal replication improvements, we are introducing integration with VMware Site Recovery Manager (SRM), the disaster recovery management and automation software for virtual machines and their workloads. The PowerFlex Storage Replication Adapter (SRA) enables PowerFlex as the native replication engine for protecting VMs on vSphere datastores. The PowerFlex SRA is compatible with SRM 8.2 or 8.3, the Photon OS-based SRM appliances. And while we are introducing this with the current releases, the SRA is compatible with PowerFlex systems running 3.5.1.x and above. Users can create recovery plans to fail over VMs to another site, fail back to the original site, or use PowerFlex’s non-disruptive replication failover testing to run failover tests in SRM.
The SRA and installation instructions are available for download from the VMware website. See the VMware Compatibility Guide to determine PowerFlex SRA compatibility with SRM versions. For detailed information about the SRA implementation and usage examples, see the whitepaper on Disaster Recovery for Virtualized Workloads Dell EMC PowerFlex using VMware Site Recovery Manager.
Figure – Architecture Overview of PowerFlex SRA and VMware SRM
PowerFlex native replication, and the integration with VMware Site Recovery Manager, provide robust, crash-consistent data protection for disaster recovery and business continuity. But we are also introducing integration with Dell EMC App
AppSync with PowerFlex integration will be available mid-July 2021. For information and examples, see the Dell EMC PowerFlex and AppSync integration video.
One more note on security. PowerFlex rack and appliance are now FIPS 140-2 compliant for data at rest and key management. Hardware based data at rest encryption is achieved using supported self-encrypting drives (SEDs), with the encryption engine running on the SEDs to deliver better performance and security. The SEDs based encryption claim is based on FIPS 140-2 Level 2 certification. Dell EMC CloudLink, the KMIP and FIPS 140-2 Level 1 (CloudLink Agent and CloudLink Server) compliant key manager, is used to manage SEDs encryption keys.
Automate (and containerize) all the things
PowerFlex software-defined infrastructure is eminently suited to cloud-native use cases and automatable workflows. There has been a lot of recent progress in PowerFlex’s support for these ecosystems. The Container Storage Interface (CSI) driver for PowerFlex continues to evolve, with support for accessing multiple PowerFlex clusters, ephemeral inline volumes, and, importantly, containerized deployment and management of the PowerFlex Storage Data Client (SDC). The containerized SDC allows CSI to inject the PowerFlex volume driver into the kernel of container-optimized operating systems that lack package managers. This provides PowerFlex CSI support for Red Hat CoreOS and Fedora CoreOS. It also enables integration of PowerFlex with Red Hat OpenShift 4.6 and greater. The forthcoming CSI version 1.5 adds support for volume consistency groups and custom file system format options, so users can set specific disk format command parameters when provisioning a volume. Star and watch the GitHub Repository for the PowerFlex CSI Driver for updates.
In addition to this, Dell Technologies has been developing a set of Container Storage Modules (CSM) that complement the CSI drivers. PowerFlex is at the forefront of that effort, and there are several modules available for tech preview, with general availability coming later this year.
Users can automate volume and snapshot lifecycle management with the PowerFlex Ansible Modules. They can also use the modules to gather facts about their PowerFlex systems and manage various storage pool and SDC details. The Ansible modules are available on GitHub and Ansible Galaxy. They work with Ansible 2.9 or later and require the PowerFlex Python SDK (which may also be used by itself to facilitate authentication to and interaction with a PowerFlex cluster). Again, watch the repositories for additional modules and expansions in the near future.
All these automation tools leverage and rely upon the PowerFlex REST API. And Dell Technologies has introduced a new Developer Portal, where the APIs for many products can be explored. The PowerFlex API, along with explanations and usage examples, can be found at https://developer.dell.com/apis/4008/versions/3.6/docs.
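As a rough illustration of what talking to the REST API can look like, the sketch below only builds (and does not send) two requests against a PowerFlex Gateway. The gateway address, the credentials, and the endpoint paths /api/login and /api/types/Volume/instances are assumptions made for this example, as is the pattern of reusing the login token as the password on later calls. Check them against the API documentation on the Developer Portal before relying on them.

```python
import base64
import urllib.request


def basic_auth_header(username: str, secret: str) -> str:
    """Return an HTTP Basic Authorization header value."""
    raw = f"{username}:{secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_request(gateway: str, path: str, username: str, secret: str) -> urllib.request.Request:
    """Build (but do not send) a GET request against the Gateway REST API."""
    return urllib.request.Request(
        f"https://{gateway}{path}",
        headers={
            "Authorization": basic_auth_header(username, secret),
            "Accept": "application/json",
        },
        method="GET",
    )


# Hypothetical gateway address and credentials, for illustration only.
login = build_request("gateway.example.com", "/api/login", "admin", "password")

# Assumption: the token returned by the login call replaces the password
# on subsequent requests. "TOKEN_FROM_LOGIN" is a placeholder, not a real token.
volumes = build_request(
    "gateway.example.com", "/api/types/Volume/instances", "admin", "TOKEN_FROM_LOGIN"
)
```

Passing either request to urllib.request.urlopen would send it. A real deployment would also need to handle TLS certificate validation for the Gateway.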
Always keep on improving
With every release, PowerFlex and PowerFlex Manager get faster, more secure, and more easily manageable. In PowerFlex 3.6 there are a number of UI enhancements, including simplification of menus, better capacity reporting around data reduction, a new dedicated area for snapshots and snapshot policy management, and – following on Dell Technologies’ drive towards more inclusive language – a change in the labels for the MDM cluster roles. “Master” and “Slave” roles are now “Primary” and “Secondary”.
PowerFlex 3.6 introduces support for Oracle Linux Virtualization (KVM based), which adds a supported hypervisor layer to the previous support for Oracle Enterprise Linux. This advances the numerous Oracle database deployments on PowerFlex, providing improved Oracle supportability while still offering the great cost-effectiveness PowerFlex offers for running Oracle. For detailed information on installing and configuring, please refer to the white paper.
In the software-defined storage layer itself, version 3.6 doubles the number of Storage Data Clients (the consumers of PowerFlex volumes) per system to 2048. This doubles the number of hosts that can map volumes from PowerFlex storage pools. The software is also smarter when it comes to detecting and handling network error cases. In some disaggregated, or two-layer, systems where the SDCs live on a separate network than the storage cluster itself, a network path impairment between an SDC and a single Storage Data Server (SDS) node can cause I/O failures – even when there isn’t a general network failure in the cluster. In version 3.6 if such a disruption occurs, the SDC can use another SDS in the system to proxy the I/O to its original destination. Users are alerted until the problem is cleared, but I/O continues uninterrupted.
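The proxy behavior can be pictured with a small model. The sketch below is not PowerFlex code. It is a simplified illustration in which the function name, the node names, and the reachability maps are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def route_io(target: str, reachable: Set[str], peers: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return the hop list for an I/O aimed at the SDS named `target`.

    reachable: SDS nodes the SDC can currently reach directly.
    peers: for each SDS, the set of SDS nodes it can reach.
    """
    if target in reachable:
        return [target]  # healthy path: send directly
    for sds in sorted(reachable):
        if target in peers.get(sds, set()):
            return [sds, target]  # proxy through another healthy SDS
    return None  # no usable path: the I/O would fail


# The SDC has lost its path to sds3 but can still reach sds1 and sds2.
hops = route_io(
    "sds3",
    reachable={"sds1", "sds2"},
    peers={"sds1": {"sds2", "sds3"}, "sds2": {"sds1", "sds3"}},
)
```

In this model the I/O is relayed through sds1, which mirrors the idea that an impaired path to one SDS need not interrupt the I/O.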
Because of the highly distributed architecture of PowerFlex, ports or sockets experiencing frequent disconnects (flapping), can cause overall system performance issues. 3.6 detects this and proactively disqualifies the path, preventing general disruption across the system.
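One common way to detect flapping is to count disconnects inside a sliding time window. The sketch below illustrates that general technique only. The class name, the threshold of three disconnects, and the 60-second window are assumptions for the example, not documented PowerFlex values.

```python
from collections import deque


class FlapDetector:
    """Disqualify a path once it disconnects too often within a time window."""

    def __init__(self, max_disconnects: int = 3, window_seconds: float = 60.0):
        self.max_disconnects = max_disconnects
        self.window_seconds = window_seconds
        self._events = deque()
        self.disqualified = False

    def record_disconnect(self, timestamp: float) -> bool:
        """Record a disconnect and return True if the path is disqualified."""
        self._events.append(timestamp)
        # Drop disconnects that have aged out of the window.
        while self._events and timestamp - self._events[0] > self.window_seconds:
            self._events.popleft()
        if len(self._events) >= self.max_disconnects:
            self.disqualified = True
        return self.disqualified


detector = FlapDetector(max_disconnects=3, window_seconds=60.0)
results = [detector.record_disconnect(t) for t in (0.0, 100.0, 110.0, 120.0)]
```

In this sketch the path stays disqualified once flagged. A real system would also need a policy for requalifying a path after it stabilizes.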
In version 3.5, we introduced Protected Maintenance Mode (PMM), a super-safe way to put a node into maintenance while nevertheless avoiding a lengthy data-rehydration process at the end. Now, PMM makes use of the highly parallel many-to-many rebalancing algorithm, as a node goes into maintenance. Depending on the amount of data stored on the node, this can still be a long process, and other things can change in the system as it’s happening. PowerFlex 3.6 adds an auto-abort feature, in which the system continually scans for hardware or capacity issues that would prevent the node from completely entering PMM. If any flags are raised, the system will abort the process and notify the user. More information on maintenance modes, and the new PMM auto-abort feature, can be found in this whitepaper.
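The auto-abort idea can be sketched as a loop that re-checks system health before each increment of work. This is only a conceptual model. The function name, the step granularity, and the shape of the scan callback are hypothetical.

```python
from typing import Callable, List


def enter_pmm(steps: int, scan: Callable[[int], List[str]]) -> dict:
    """Move data off a node in `steps` increments, scanning for issues before each one."""
    for done in range(steps):
        issues = scan(done)  # hardware or capacity problems found right now
        if issues:
            # Abort instead of leaving the node stuck partway into maintenance.
            return {"state": "aborted", "completed_steps": done, "issues": issues}
    return {"state": "in_maintenance", "completed_steps": steps, "issues": []}


# A capacity problem shows up after two increments have completed.
outcome = enter_pmm(5, lambda step: ["capacity low"] if step == 2 else [])
```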
PowerFlex Manager 3.7 has gotten much more powerful as well. Foremost among the improvements is a new Compatibility Management feature. This new feature helps customers automatically identify the recommended upgrade paths for both the PowerFlex Manager appliance itself and the system RCM/IC upgrade. Prior to this release, whenever a customer or Dell Professional Services wished to do an upgrade, it took a lot of effort and time to manually investigate the documentation and compatibility matrixes to understand all of the upgrade rules – what are the allowed upgrade paths, which PowerFlex Manager version works with which RCM/IC versions, etc.
The new Compatibility Management tools eliminate the work and assist users by automatically identifying recommended upgrade paths. To determine which paths are valid and which are not, PowerFlex Manager uses information that is provided in a compatibility matrix file. The compatibility matrix file maps all the known valid and invalid paths for all previous releases of the software. It breaks the possible upgrade paths down as:
PowerFlex Manager 3.7 also introduces support for vSphere 7.0 U2. Upgrading to this version requires a manual vCenter upgrade. But then PowerFlex Manager will take over and manage the ESXi clusters. PowerFlex Manager 3.7 supports VMware ESXi 7.0 Update 2 installation, upgrade, and expansion operations for both hyperconverged and compute-only services. Users can deploy new services, add existing services running VMware ESXi 7.0 U2, or expand existing services. PowerFlex Manager also supports upgrades of VMware ESXi clusters in hyperconverged or compute-only services. You can upgrade VMware ESXi clusters from version 6.5, 6.7, or 7.0 to VMware ESXi 7.0 Update 2.
When you deploy a new ESXi 7.0U2 service, PowerFlex Manager automatically deploys two service volumes and maps these volumes to two heartbeat datastores on shared storage. PowerFlex Manager also deploys three vSphere Cluster Services (vCLS) VMs for the cluster.
PowerFlex Manager introduces several other enhancements in this release. It now supports 32k volumes per Service, aligned with PowerFlex core software volume scalability. It has enhanced security for SMB/NFS. A user-specific account is now required to gain access to the SMB share. PowerFlex Manager also updates the NFS share configuration when a user upgrades or restores the virtual appliance. PowerFlex Manager has disabled support for the SMBv1 protocol. PowerFlex Manager now uses SMBv2 or SMBv3 to enhance security.
It has also expanded its management capabilities over the PowerFlex Presentation Server and Gateway services. Prior to this release, PowerFlex Manager could deploy a Presentation Server (which hosts the WebUI) but not upgrade it. Now, PowerFlex Manager 3.7 can both discover existing instances and upgrade Presentation Servers. Similarly, it has gained the ability to upgrade the OS for the Gateway (which hosts the REST API). Prior to this release, PowerFlex Manager could only upgrade the Gateway RPM package without upgrading and patching the OS of the Gateway. Now PowerFlex Manager 3.7 can do both.
But it’s not all about software
This release adds support for a broader array of NVIDIA GPUs. Next-gen NVIDIA acceleration cards are now available for customers looking to run specialized, high-performance computing and analytics applications: Quadro RTX 6000, Quadro RTX 8000, A40, and A100. And we also introduce a small form factor GPU that can be used in the 1U R640-based PowerFlex Nodes: the NVIDIA A100. The past year demonstrated the importance of supporting remote workers with virtual desktops, and PowerFlex supports GPU implementations on Citrix and VMware VDI.
We now support the Dell PowerSwitch S5296F-ON for the PowerFlex appliance. The S5296F-ON has 96x 10/25G SFP28 ports and 8x 100G QSFP28 ports. It can support high node counts in a single cabinet, if the high oversubscription ratio is acceptable. We also introduce support for the Cisco Nexus 93180YC-FX, for use as either an access or an aggregation switch, and the Cisco 9364C-GX, for use as either a leaf or a spine switch, with 64x 100G ports.
Virtualized network infrastructure continues to grow in capability and deployment share. NSX-T™ is VMware's software-defined-network infrastructure that addresses cross-cloud needs for VM-centric network services and security. The PowerFlex appliance now joins the PowerFlex rack, in supporting NSX-T Ready configurations. “NSX-T Ready” means that the hardware configuration meets NSX-T requirements. The customer will provide NSX-T software and deploy with assistance from VMware or Dell Professional Services. The enabling components are:
As with the PowerFlex rack, appliance NSX-T edge nodes are “service appliances” that are dedicated to run network services, while the newly available HA appliance management nodes run the NSX-T management VMs. PowerFlex Manager can assist in deploying the edge nodes and will lifecycle the hardware aspects.
Wrap it up
Thanks for taking time to read about what’s new with Dell EMC PowerFlex software-defined infrastructure. We haven’t even been able to cover all the great new things being introduced this summer. Supercharge your mission-critical workloads flawlessly with enhanced automation, effortlessly enable business continuity and compliance, and confidently manage your data center operations at scale. To continue exploring, visit us on the Dell Technologies website for PowerFlex.
Copy data management with AppSync for applications running on Dell EMC PowerFlex
Mon, 21 Jun 2021 10:42:34 -0000
Dell EMC PowerFlex, a premier software-defined platform for your mission-critical workloads, empowers organizations to move faster and respond effectively to rapidly changing business needs. PowerFlex provides unprecedented freedom to deploy and scale critical workloads that drive your business, while ensuring exceptional performance, simplicity, and manageability.
As the world of DevOps evolves, agility in IT operations is critical in order to rapidly provision environments for test and development. This agility requires a platform which includes integrated copy data management for DevOps environments.
The PowerFlex software-defined storage solution enables this transformational agility for organizations looking to modernize their DevOps application development operation and empowers organizations to move faster and respond more effectively to rapidly changing business needs.
Dell EMC AppSync for PowerFlex provides a single user interface that simplifies, orchestrates and automates the process of generating and consuming DevOps data across all enterprise database applications deployed on PowerFlex.
AppSync for PowerFlex provides simple automated copy creation and consumption, eliminating manual steps or custom scripts. AppSync integrates tightly with host environments and database applications including, but not limited to, Oracle and SQL Server. With AppSync, application owners, database administrators, and storage administrators get on, and stay on, the same page through a transparent copy workflow.
Dell EMC AppSync for PowerFlex allows you to protect, restore and repurpose application data, satisfying any DevOps requirements.
AppSync version 4.3 enables support for the PowerFlex family - rack, appliance and ready node consumption options.
The architecture of AppSync has three major components:
Registering the PowerFlex system with AppSync
AppSync interacts with the PowerFlex system by communicating with PowerFlex Gateway using API calls:
1. On the AppSync console, select Settings > Infrastructure Resources > STORAGE SYSTEMS. Click ADD SYSTEMS.
2. Under Select System Type, choose PowerFlex.
3. Enter the PowerFlex Gateway IP and credentials to configure the storage system.
4. Review the configurations in the Summary page and click FINISH to register the PowerFlex system.
AppSync service plans
AppSync provides intuitive workflows to set up protection and repurposing jobs (called Service Plans) that provide end-to-end automation of all the steps from application discovery and storage mapping to mounting copies to the target hosts. Service plans can be scheduled with alert emails to easily track their status. AppSync also provides an application protection monitoring and reporting service that generates alerts if SLAs are not met or if a service plan fails.
AppSync supports three types of service plans:
AppSync enables application owners and DBAs to protect, restore, and repurpose their data to satisfy their unique copy requirements. This accelerates and improves processes like test and dev by providing the latest production data for high-quality product releases. AppSync’s support for second-generation copies (a copy of a copy) allows for required data masking, filtering, and obfuscation by DBAs so that end users have access to only the data that they need. At any given point in time, storage admins can get a complete picture of the copy landscape so that they are aware of capacity utilization and the scope for optimization.
AppSync allows you to create copies of your database and file systems for application testing and validation, test and development, reporting, data masking, and data analytics. AppSync identifies copies that are created from a repurpose action as first-generation and second-generation copies. The source of a second-generation copy is a first-generation copy. You can create multiple second-generation copies from a first-generation copy.
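The relationship between first-generation and second-generation copies can be modeled as a short parent chain. The sketch below is a conceptual illustration, not the AppSync API. The Copy class and the repurpose helper are hypothetical names, and limiting the chain to two generations simply mirrors the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Copy:
    name: str
    source: Optional["Copy"] = None  # None means the copy was taken from production

    @property
    def generation(self) -> int:
        """1 for a copy of production, 2 for a copy of a first-generation copy."""
        return 1 if self.source is None else self.source.generation + 1


def repurpose(name: str, source: Optional[Copy] = None) -> Copy:
    """Create a first-generation copy, or a second-generation copy of `source`."""
    if source is not None and source.generation >= 2:
        raise ValueError("the source of a second-generation copy must be a first-generation copy")
    return Copy(name, source)


gen1 = repurpose("prod-db-copy")
masked = repurpose("masked-for-dev", gen1)      # second generation
reporting = repurpose("reporting-copy", gen1)   # several second-gen copies can share one source
```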
AppSync support for PowerFlex
AppSync integration with PowerFlex videos
AppSync integration enables PowerFlex users to protect, restore and repurpose their data to satisfy their unique copy requirements for their enterprise applications users.
CloudLink 7.1: Simplifying datacenter security
Fri, 23 Apr 2021 12:10:59 -0000
Are you feeling safe about the security of your data center’s infrastructure? Chances are, you aren’t. According to a recent poll [1], 74% of customers report experiencing some form of cyberattack in the last twelve months, and 86% were concerned about potential cyberattacks. Clearly, data center security is a topic that can no longer be ignored, and most of our customers are taking steps to ensure their data is safe. Yet even though it’s necessary, adopting data center security can be confusing, complex, and difficult to implement.
Dell EMC CloudLink aids our customers in this effort by being reliable, flexible, and easy to use. Our 7.1 release adds new tools to our toolbox, including shallow rekey for our container-based encryption, support for vVols encryption and IPv6-only environments, and the new Secure Configuration Summary page designed to make security audits of CloudLink a breeze.
Every security related framework published discusses the need for regular monitoring and assessment of implemented security controls to ensure that the products and deployment are meeting relevant industry standards. Such activities usually include the dreaded yearly security audit. Datacenter administrators view this effort with disfavor because it takes time out of their already busy schedule to walk through the deployment with the auditor to prove compliance.
In the past, we’ve heard from our customers that the CloudLink GUI is easy enough to navigate that security audit reviews weren’t too painful, but they occasionally expressed that it would be nice to make them a little bit easier. Well, we heard their requests loud and clear and have obliged with the Secure Configuration Summary. We’ve gathered the information commonly requested during security audits onto one page, so when the security administrator and auditor go to CloudLink for a review, it’s a one-stop shop.
With audits though, simply viewing configuration settings isn’t enough as most auditors require tangible proof to attach to their reports. Screen shots work but we offer something better – the ability to export the configuration settings provided on the summary page. As with most of our GUI pages, you can export the Secure Configuration Summary to a handy-dandy spreadsheet which can be presented directly to the auditor. A one click audit review – can it get any easier than that?
Of course, not all audits are the same and some requirements are more extensive than others. To accommodate this eventuality, our summary page provides direct links to the configuration pages for each setting. If an auditor needs more information on a particular configuration, simply jump to the relevant page, review, and download an export if needed.
Encryption is hard and it can be a challenge to understand, implement, and maintain. We understand that most of our customers are not in the datacenter security business. CloudLink strives to make data encryption in the datacenter a simple, set it and forget it task, so that our customers can focus on their core business, not on trying to figure out how to keep their data safe – that’s our job.
If you would like to know more about CloudLink and our latest release please visit our website and reach out to your Dell Technologies sales team to ask how we can make data encryption easy for you too.
[1] Source: statista.com
Deploying Tanzu Application Services on Dell EMC PowerFlex
Tue, 15 Dec 2020 14:35:58 -0000
The Tanzu Application Service (TAS) architecture provides the best approach available today to enable agility at scale with the reliability that is a must to address these challenges. The PowerFlex family offers key value propositions: support for traditional and cloud-native production workloads, deployment flexibility, linear scalability, predictable high performance, and enterprise-grade resilience.
The VMware Tanzu Application Service (TAS) is based on Cloud Foundry, an open-source cloud application platform that provides a choice of clouds, developer frameworks, and application services. Cloud Foundry is a multi-cloud platform for the deployment, management, and continuous delivery of applications, containers, and functions. TAS abstracts away the process of setting up and managing an application runtime environment so that developers can focus solely on their applications and associated data. Running a single command, cf push, creates a scalable environment for your application in seconds, which might otherwise take hours to spin up manually. TAS allows developers to deploy and deliver software quickly, without needing to manage the underlying infrastructure.
PowerFlex (previously VxFlex OS) is the software foundation of PowerFlex software-defined storage. It is a unified compute, storage, and networking solution that delivers a scale-out block storage service designed for flexibility, elasticity, and simplicity, with predictable high performance and resiliency at scale.
The PowerFlex platform is available in multiple consumption options to help customers meet their project and data center requirements. PowerFlex appliance and PowerFlex rack provide customers comprehensive IT Operations Management (ITOM) and life cycle management (LCM) of the entire infrastructure stack in addition to sophisticated high-performance, scalable, resilient storage services. PowerFlex appliance and PowerFlex rack are the two preferred and proactively marketed consumption options. PowerFlex is also available on VxFlex Ready Nodes for those customers interested in software-defined compliant hardware without the ITOM and LCM capabilities.
PowerFlex software-defined storage with unified compute and networking offers flexible deployment architectures to meet specific deployment and architectural requirements. PowerFlex can be deployed in a two-layer architecture for asymmetrical scaling of compute and storage (“right-sizing” capacities), in a single-layer (HCI) architecture, or in a mixed architecture.
For this example, a PowerFlex production cluster is set up using a hyperconverged configuration. The production cluster has connectivity to the customer-data network and the private backend PowerFlex storage network. The PowerFlex production cluster consists of a minimum of four servers that host the workload and PowerFlex storage VMs. All the nodes are part of a single ESXi cluster and part of the same PowerFlex cluster. Each node contributes all of its internal disk resources to the PowerFlex cluster.
The PowerFlex management software manages the capacity of all of the disks and acts as a back-end for data access by presenting storage volumes to be consumed by the applications running on the nodes. PowerFlex Manager also provides the essential operational controls and lifecycle management tools. The production cluster hosts the compute nodes that are used for deployment of TAS VMs. TAS components are deployed across three dedicated compute clusters that are designated as three availability zones. These compute clusters are managed by the same 'compute workload' vCenter as the dedicated Edge cluster. The following figure depicts the layout in the lab environment:
Figure 1. PowerFlex production cluster
The compute infrastructure illustrates the best-practice architecture, using three availability zones (AZs) on PowerFlex rack with nodes in a hyperconverged configuration. This design ensures the high availability of nodes (that is, nodes in AZ1 will still function if AZ2 or AZ3 goes down). The dedicated compute clusters in each AZ combine to form an Isolation Zone (IZ). These AZs can be used to deploy and run the TAS stateful workloads requiring persistent storage. On the PowerFlex storage, we have created volumes in the backend that are mapped to vSphere as datastores.
PowerFlex storage distributed data layout scheme is designed to maximize protection and optimize performance. A single volume is divided into chunks. These chunks will be distributed (striped) on physical disks throughout the cluster, in a balanced and random manner. Each chunk has a total of two copies for redundancy.
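To make the layout idea concrete, the sketch below places each chunk of a volume on two different disks chosen at random. It is a toy model, not the PowerFlex placement algorithm. The function name, the disk names, and the use of a seeded random generator are assumptions for illustration, and random choice only approximates balance over many chunks.

```python
import random
from typing import Dict, List, Tuple


def place_chunks(num_chunks: int, disks: List[str], seed: int = 0) -> Dict[int, Tuple[str, ...]]:
    """Assign each chunk two copies on two different disks, chosen at random."""
    if len(disks) < 2:
        raise ValueError("two copies need at least two disks")
    rng = random.Random(seed)  # seeded so the example is repeatable
    # rng.sample never returns the same disk twice, so the copies are always apart.
    return {chunk: tuple(rng.sample(disks, 2)) for chunk in range(num_chunks)}


# Hypothetical cluster with one disk per node.
layout = place_chunks(8, ["node1-disk", "node2-disk", "node3-disk", "node4-disk"])
```

Because the two copies of a chunk never share a disk in this model, losing any single disk still leaves one copy of every chunk available.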
PowerFlex can optionally be configured to achieve additional data redundancy by enabling the Fault Sets feature. Persistent storage for each AZ could be its own PowerFlex cluster. By implementing Fault Sets, we can ensure that persistent data is available at all times. A Fault Set is a subgroup of Storage Data Servers (SDSs) installed on host servers within a Protection Domain. PowerFlex mirrors the data for a Fault Set on SDSs that are outside the Fault Set. Thus, availability is assured even if all the servers within one Fault Set fail simultaneously.
PowerFlex enables flexible scale-out capabilities for your data center and provides unparalleled elasticity and scalability. Start with a small environment for your proof of concept or a new application and add nodes as needed when requirements evolve.
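The Fault Set guarantee amounts to a simple rule: the two copies of any piece of data must never sit in the same Fault Set. The sketch below checks that rule for a hypothetical mapping of data chunks to SDS names. The function, the SDS names, and the Fault Set names are illustrative assumptions, not PowerFlex objects.

```python
from typing import Dict, Tuple


def survives_fault_set_loss(layout: Dict[int, Tuple[str, str]], fault_set_of: Dict[str, str]) -> bool:
    """Return True if no chunk keeps both of its copies inside one Fault Set."""
    return all(fault_set_of[a] != fault_set_of[b] for a, b in layout.values())


# Hypothetical mapping: one Fault Set per availability zone.
fault_set_of = {"sds1": "AZ1", "sds2": "AZ1", "sds3": "AZ2", "sds4": "AZ3"}

good_layout = {0: ("sds1", "sds3"), 1: ("sds2", "sds4")}
bad_layout = {0: ("sds1", "sds2")}  # both copies in AZ1

ok = survives_fault_set_loss(good_layout, fault_set_of)
not_ok = survives_fault_set_loss(bad_layout, fault_set_of)
```

With the first layout, every chunk keeps a copy outside any single Fault Set, so the loss of all servers in AZ1 leaves the data readable. The second layout would not survive that failure.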
The solution mentioned in this blog provides recommendations for deploying a highly available and production-ready Tanzu Application Service on Dell EMC PowerFlex rack infrastructure platform to meet the performance, scalability, resiliency, and availability requirements and describes its hardware and software components. For complete information, see Tanzu Application Services on PowerFlex rack - Solution Guide.
Introducing the PowerFlex Management Pack for vRealize Operations
Mon, 02 Nov 2020 13:09:42 -0000
Achieving operational efficiency in today’s modern cloud infrastructure brings automation to the forefront. Centralized visibility provides a key piece of the insight needed to identify operational inefficiencies and take actions that mitigate business disruption.
We are pleased to share the general availability of Dell EMC PowerFlex Management Pack for vRealize Operations 8.x. The PowerFlex MP for vROps extends the visibility of PowerFlex systems into vROps where IT can monitor their complete data center and cloud operations. It is available to all PowerFlex rack and appliance customers at no additional cost. This brings additional value to the comprehensive IT operations management functionality delivered by PowerFlex Manager that enables full life cycle management of the unified compute and software defined storage solution.
The management pack uses APIs to query and collect key PowerFlex metrics for storage, compute, networking, and server hardware, and ingests them into vROps, where they can be visualized using the out-of-the-box dashboards. It also provides a detailed system-level view that shows the health status of, and relationships between, the different components of the PowerFlex system.
Dashboards: The management pack includes 13 default dashboards showing details of PowerFlex storage, PowerFlex Manager, PowerFlex nodes, network switches, ESXi hosts, and clusters. These configurable dashboards provide user customizable data displays that adjust to meet a wide variety of requirements.
Predefined symptoms and alert definitions: The management pack includes 166 symptom definitions and 152 alert definitions based on engineering best practices for the PowerFlex systems. Symptoms and alerts can be customized by the user to meet the demand of their environment.
Historical data: This is available for all PowerFlex Adapter resource kinds. This data provides a view of consumption over time and includes capacity forecasting based on usage for PowerFlex storage.
Network topology and relationship: The topology tree functionality available in vROps is extremely useful when mapping relationships between nodes, network interfaces, switch ports, VLANs, port-channels, and vPCs.
Detailed metric collection: In addition to the default dashboards, users have the option of drilling into specific metrics for nearly all available data from the components of the PowerFlex system, even if a metric is not included in a dashboard.
Multiple PowerFlex systems awareness: Ability to group and differentiate multiple PowerFlex systems.
PowerFlex node type differentiation: Ability to identify and classify compute, storage, hyperconverged, and management controller nodes.
PowerFlex Details: This dashboard shows all the PowerFlex storage KPIs with historical data providing a view of storage performance utilization over time.
PowerFlex Node Summary: You can monitor the health status of all your PowerFlex nodes and their hardware components in this dashboard.
PowerFlex Networking Performance: This dashboard shows network KPIs like throughput, errors, packet discards with historical data providing a view of network utilization over time.
For customers who have already invested in vRealize Operations, this management pack is a great value add for monitoring their PowerFlex systems. It is an end-to-end monitoring and alerting solution for PowerFlex infrastructure using vROps. It helps customers significantly with capacity planning based on historical data of resource consumption over time. It also helps identify usage trends and provides the insight needed to spot operational issues and inefficiencies, so that the necessary actions can be taken to avoid service outages and mitigate business disruption. This integration with VMware vRealize Operations reduces operational complexity by using a unified platform to monitor and manage private data center infrastructure, as well as hybrid and multi-cloud environments.
Demystifying CSI plug-in for PowerFlex (persistent volumes) with Red Hat OpenShift
Wed, 14 Oct 2020 18:12:01 -0000|
The Container Storage Interface (CSI) is a standard for exposing file and block storage to containerized workloads on container orchestrators such as Kubernetes and OpenShift. CSI enables third-party storage providers (for example, PowerFlex) to write plug-ins that let OpenShift consume storage from their backends as persistent storage.
The CSI driver for Dell EMC VxFlex OS can be installed using the Dell EMC Storage CSI Operator. It is a community operator and can be deployed using OperatorHub.io.
Master node components do not communicate directly with the CSI driver. They interact only with the API server on the master nodes. The CSI driver deployment (through its sidecar containers) MUST therefore watch the Kubernetes API and trigger the appropriate CSI operations. The kubelet discovers CSI drivers using the kubelet plug-in registration mechanism and issues calls directly to the CSI driver.
External Provisioner – The CSI external provisioner is a sidecar container that watches the k8s API server for PersistentVolumeClaim objects. It calls CreateVolume against the specified CSI endpoint to provision a volume.
External Attacher – The CSI external attacher is a sidecar container that watches the API server for VolumeAttachment objects and triggers controller [Publish|Unpublish] volume operations against a CSI endpoint.
CSI Controller plug-in – The controller component can be deployed as a Deployment or StatefulSet on any node in the cluster. It consists of the CSI driver that implements the CSI Controller service.
CSI Identity – It enables k8s components and CSI containers to identify the driver.
CSI Node Plugin – The node component should be deployed on every node in the cluster through a DaemonSet. It consists of the CSI driver that implements the CSI Node service and the node driver registrar sidecar container.
Storage within OpenShift Container Platform 4.x is managed from worker nodes. The CSI API uses two new resources: PersistentVolume (PV) and PersistentVolumeClaim (PVC) objects.
Persistent Volumes – Kubernetes provides physical storage devices to the cluster in the form of objects called Persistent Volumes.
Persistent Volume Claim – This object lets pods use storage from Persistent Volumes.
Storage Class – This object helps you create a PV/PVC pair for pods. It stores the information needed to create a persistent volume.
- name: powerflexos
- key: csi-vxflexos.dellemc.com/X_CSI_VXFLEXOS_SYSTEMNAME
- name: powerflex-xfs
- key: csi-vxflexos.dellemc.com/X_CSI_VXFLEXOS_SYSTEMNAME
Static Provisioning – This allows you to manually make existing PowerFlex storage available to the cluster.
Dynamic Provisioning - Storage volumes can be created on-demand. Storage resources are dynamically provisioned using the provisioner that is specified by the StorageClass object.
Retain Reclaiming – Once a PersistentVolumeClaim is deleted, the corresponding PersistentVolume is not deleted. Instead, it moves to the Released state, and its data can be manually recovered.
Delete Reclaiming – This is the default reclaim policy. Unlike the Retain policy, the persistent volume is deleted when its claim is deleted.
Access Mode - ReadWriteOnce -- the volume can be mounted as read/write by a single node.
Supported FS - ext4/xfs.
Raw Block Volumes: Using the raw block option, a PV can be attached directly to a pod or application without being formatted with an ext4 or xfs file system.
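To make the StorageClass, PVC, reclaim policy, and raw block options above more concrete, here is a minimal Python sketch that builds the two manifests as plain dictionaries and prints them as JSON (which Kubernetes and OpenShift accept alongside YAML). The provisioner name is inferred from the topology key prefix shown earlier; the parameter names (`storagepool`, `FsType`), the pool name `pool1`, the claim name, and both helper functions are illustrative assumptions, so check the driver documentation for the exact values your version expects.

```python
import json

# Provisioner name inferred from the topology key prefix shown in this post.
PROVISIONER = "csi-vxflexos.dellemc.com"


def storage_class(name, storage_pool, fs_type="ext4", reclaim_policy="Delete"):
    """Build a StorageClass manifest for dynamic provisioning (illustrative)."""
    if fs_type not in ("ext4", "xfs"):
        raise ValueError("supported file systems are ext4 and xfs")
    if reclaim_policy not in ("Delete", "Retain"):
        raise ValueError("reclaim policy must be Delete or Retain")
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": PROVISIONER,
        "reclaimPolicy": reclaim_policy,
        # Hypothetical parameter names; verify against the driver docs.
        "parameters": {"storagepool": storage_pool, "FsType": fs_type},
    }


def volume_claim(name, class_name, size, raw_block=False):
    """Build a PersistentVolumeClaim that requests storage from a class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # "Block" asks for a raw device; "Filesystem" gets ext4 or xfs.
            "volumeMode": "Block" if raw_block else "Filesystem",
            "storageClassName": class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    sc = storage_class("powerflex-xfs", "pool1", fs_type="xfs")
    pvc = volume_claim("oracle-data", "powerflex-xfs", "8Gi")
    print(json.dumps([sc, pvc], indent=2))
```

Saving the printed objects to a file and applying them with the cluster CLI would be the usual next step. Setting `reclaim_policy="Retain"` keeps the PowerFlex volume after the claim is deleted.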
A Case for Repatriating High-value Workloads with PowerFlex Software-Defined Storage
Wed, 26 Aug 2020 18:33:51 -0000|
Kent Stevens, Product Management, PowerFlex
Brian Dean, Senior Principal Engineer, TME, PowerFlex
Michael Richtberg, Chief Strategy Architect, PowerFlex
In this blog, we look at customers repatriating key applications from the Cloud, help you think about where to run your key applications, and explain how PowerFlex’s unique architecture meets the demands of these workloads in running and transforming your business.
For critical software applications you depend upon to power core business and operational processes, moving to “The Cloud” might seem the easiest way to gain the agility to transform the surrounding business processes. Yet we see many of our customers making the move back home, back “On-Prem” for these performance-sensitive critical workloads – or resisting the urge to move to The Cloud in the first place. PowerFlex is proving to deliver agility and ease of operations for the IT infrastructure for high-value, large-scale workloads and data-center consolidation, along with a predictable cost profile – as a Cloud-like environment enabling you to reach your business objectives safely within your own data center or at co-lo facilities.
IDC recently found that 80% of their customers had repatriation activities, and 50% of public-cloud based applications were targeted to move to hosted-private cloud or on-premises locations within two years(1). IDC notes that the main drivers for repatriation are security, performance, cost, and control. Findings reported by 451 Research(2) show cost and performance as the top disadvantages of cloud storage services when compared with on-premises storage. We’ve further observed that core business-critical applications are a significant part of these migration activities.
You may have heard the term “data gravity,” which relates to the difficulty of moving data to and from the cloud, but that may be only part of the problem. “Application gravity” is likely a bigger problem for performance-sensitive workloads that struggle to achieve the required business results because of the scale and performance limitations of cloud storage services.
Transformation is the savior of your business – but a problem for your key business applications
Business transformation impacts the data-processing infrastructure in important ways: applications that were stable and seldom touched are now the subject of massive changes on an ongoing basis. Revamped and intelligent business processes require new pieces of data, increasing the storage requirements, and those smarts (the newly automated or augmented decision-making) require constant tuning and adjustments. This is not what you want for the applications that power your most important business workflows and generate your profitability. You need maximum control and full purview over this environment to avoid unexpected disruptions. It’s a well-known dilemma that you must change the tires while the car is driving down the road – and today’s transformation projects can take this to the extreme.
The infrastructure used to host such high-profile applications – computing, storage and networking – must be operated at scale yet still be ready to grow and evolve. It must be resilient, remain available when hardware fails, and be able to transform without interruption to the business.
Does the public cloud deliver the results you expected?
Do your applications require certain minimum amounts of throughput? Are there latency thresholds you consider critical? Do you require large data capacities and the ability to scale as demands grow? Do you require certain levels of availability? You may assume all these requirements come with a “storage” product offered by the public cloud platforms, but most fall short of meeting these needs. Some require over-provisioning to get better performance. High availability options may be lacking. The highest performing options have capacity scale limitations and can be prohibitively expensive. If you assume that what you’ve been using on-prem is also available from a hyperscaler, you may be quite surprised to find substantial gaps. Closing them can require expensive application rearchitecting to become “cloud native,” which may become a budget buster. These public cloud attributes can lead to “application gravity” gaps.
While its agility is tempting, the unexpected costliness of moving everything to the public cloud has turned back more than one company. When evaluating the economics and business justification for Cloud solutions, many costs associated with full-scale operations, spikes in demand, or extended services can be hard to estimate, and can turn out to be large and unpredictable.
The full price of cloud adoption must account for the required levels of resiliency, management infrastructure, storage and analytics for operational data, security solutions, and scaling up the resources to realistic production levels. Recognizing all the necessary services and scale may undermine what might have initially appeared to be a solid cost justification. Once the budget is established, active effort and attention must be devoted to monitoring and oversight. Adapting to unexpected operational events, such as bursting or autoscaling for temporary spikes in workload or traffic, can bring unforeseen leaps in the monthly bill. Such situations can be especially hard to predict and plan for – and very difficult to control.
You want the speed, convenience and elasticity of running in the cloud - but how do you ensure that agility while staying within the necessary bounds of cost and oversight? Truly transformative infrastructure allows businesses to consolidate compute and storage for disparate workloads onto a single unified infrastructure to simplify their environment, increase agility, improve resiliency and lower operational costs. And your potential payoff is big with far easier scaling, more efficient hardware utilization, and less time spent figuring out how to get things right or tracking down issues that complicate disparate system architectures.
Software-Defined is the Future
IDC predicts that by 2024, software-defined infrastructure solutions will account for 30% of storage solutions(3). At the heart of the PowerFlex family, and the enabler of its flexibility, scale, and performance, is PowerFlex software-defined storage. The ease and reliability of deployment and operation are provided by PowerFlex Manager, an IT operations and lifecycle management tool for full visibility and control over the PowerFlex infrastructure solutions.
PowerFlex’s unmatched combination of flexibility, elasticity, and simplicity with predictable high performance - at any scale - makes it ideally suited to be the common infrastructure for any company. Utilizing software defined storage (SDS) and hosting multiple heterogeneous computing environments, PowerFlex enables growth, consolidation, and change with cloud-like elasticity – without barriers that could impede your business.
The resulting unique architecture of the PowerFlex family easily meets the large-scale, always-on requirements of our customers’ core enterprise applications. The power and resiliency of the PowerFlex infrastructure platforms handle everything from high-performance enterprise databases, to web-scale transaction processing, to demanding business solutions in various industries including healthcare, utilities and energy. And this includes the new big-data and analytical workloads that are quickly augmenting the core applications as the business processes are being transformed.
PowerFlex: A Unique Platform for Operating and Transforming Critical Applications
PowerFlex provides the flexibility to utilize your choice of tools and solutions to drive your transformation and consolidation, while controlling the costs of the relentless expansion in data processing. PowerFlex provides the modularity to adapt and grow efficiently while providing the manageability to simplify your operations and reduce costs. It provides the scalable infrastructure on-premises to allow you to focus on your business operations. PowerFlex on-demand options, available by the end of 2020, will enable an elastic OPEX consumption model as well.
As your business needs change, PowerFlex provides a non-disruptive path of adaptability. As you need more compute, storage or application workloads, PowerFlex modularly expands without complex data migration services. As your application infrastructure needs change from virtualization to containers and bare metal, PowerFlex can mix and match these in any combination necessary without needing physical changes or cluster segmentation. PowerFlex provides future-proof capabilities that keep up with your demands with six nines of availability and linear scalability.
With the dynamic new pace of growth and change, PowerFlex can ensure you stay in charge while enabling the agility to adapt efficiently. PowerFlex enables you to leverage the advantages of oversight and cost-effectiveness of the on-premises environment with the ability to meet transformation head-on.
1 IDC Cloud Repatriation Accelerates in a Multi-Cloud World, July 2018
2 451 Research, 2020 Voice of the Enterprise
3 IDC FutureScape: Worldwide Enterprise Infrastructure 2020 Predictions, October 2019
PowerFlex Native Asynchronous Replication RPO with Oracle
Mon, 17 Aug 2020 15:52:45 -0000|
PowerFlex software-defined storage platform provides a reliable, high-performance foundation for mission-critical applications like Oracle databases. In many of these deployments, replication and disaster recovery have become a common practice for protecting critical data and ensuring application uptime. In this blog, I will be discussing strategies for replicating mission-critical Oracle databases using Dell EMC PowerFlex software-defined storage.
Customers require disaster recovery and replication capabilities to meet mission-critical business requirements where SLAs demand the highest uptime. Customers also want the ability to recover quickly from physical or logical disasters, to ensure business continuity and to bring applications back up in minimal time without impact to data. Replication means that the same data is available at multiple locations. For Oracle database environments, it is important to have local and remote replicas of application data that are suitable for testing, development, reporting, disaster recovery, and many other operations. Replication improves the performance and protects the availability of Oracle database applications because the data exists in another location. The advantage of having multiple copies of data across geographies is that critical business applications continue to function if the local Oracle database server experiences a failure.
Replication enables customers in various scenarios such as:
PowerFlex is a software-defined storage platform designed to significantly reduce operational and infrastructure complexity, empowering organizations to move faster by delivering flexibility, elasticity, and simplicity with predictable performance and resiliency at scale. The PowerFlex family provides a foundation that combines compute as well as high performance storage resources in a managed unified fabric.
PowerFlex is designed to provide extreme performance and massive scalability up to 1000s of nodes. It can be deployed as a disaggregated storage / compute (two-layer), HCI (single-layer), or mixed architecture. PowerFlex supports applications ranging from bare-metal workloads and virtual machines to cloud-native containerized apps. It is widely used for large-scale mission-critical applications like Oracle database. For information about best practices for deploying Oracle RAC on PowerFlex, see Oracle RAC on PowerFlex rack.
PowerFlex also offers several enterprise-class native capabilities to protect critical data at various levels:
PowerFlex software consists of a few important components: Meta Data Manager (MDM), Storage Data Server (SDS), Storage Data Client (SDC), and Storage Data Replicator (SDR). The MDM manages the PowerFlex system as a whole, which includes metadata, device mapping, volumes, snapshots, system capacity, errors and failures, and system rebuild and rebalance tasks. The SDS is the software component that enables a node to contribute its local storage to the aggregated PowerFlex pool. The SDC is a lightweight device driver that exposes PowerFlex volumes as block devices to applications and hosts. The SDR handles replication activities. PowerFlex has a unique feature called the Protection Domain. A Protection Domain is a logical entity that contains a group of SDSs. Each SDS belongs to only one Protection Domain.
Figure 1. PowerFlex asynchronous replication between two systems
Replication occurs between two PowerFlex systems designated as peer systems. These peer systems are connected using a LAN or WAN and are physically separated for protection purposes. Replication is defined in the scope of a protection domain. All objects that participate in replication are contained in the protection domain, including the volumes in a Replication Consistency Group (RCG). Journal capacity from storage pools in the protection domain is shared among the RCGs in the protection domain.
The SDR handles replication activities and manages the I/O of replicated logical volumes. The SDR is deployed on the same server as the SDS. Only I/Os from replicated volumes flow through the SDR.
Figure 2. PowerFlex replication I/O flow between two systems
For detailed information about Architecture Overview, see Dell EMC PowerFlex: Introduction to Replication White Paper.
It is important to note that this approach to replication allows PowerFlex to support replication at extreme scales. As the number of nodes contributing storage is scaled, so is the number of SDR instances. As a result, this replication mechanism can scale effortlessly from 4 to 1000s of nodes while delivering RPOs as low as 30 seconds and meeting I/O and throughput requirements.
The following illustration demonstrates that the volumes participating in replication are grouped to form the Replication Consistency Group (RCG). RCG acts as the logical container for the volumes.
Figure 3. PowerFlex replication with Oracle database
Depending on the scenario, we can create multiple RCGs for each volume pair or combine multiple volume pairs in a single RCG.
In the above Oracle setup, PowerFlex System-1 is the source and PowerFlex System-2 is the destination. For replication to occur between the source and target, the following criteria must be met:
PowerFlex replication is designed to support RPOs as low as 30 seconds, minimizing data loss when disaster recovery is needed. When creating an RCG, users can specify an RPO from a minimum of 30 seconds to a maximum of 60 minutes.
All operations performed on the source are replicated to the destination within the RPO. To ensure RPO compliance, PowerFlex replicates at least twice for every RPO period. For example, setting the RPO to 30 seconds means that PowerFlex can immediately return to operation at the target system with no more than 30 seconds of potential data loss.
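The arithmetic behind "at least twice for every RPO period" can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post, using the 30 second and 60 minute limits quoted above; the function names are made up for the example and are not part of any PowerFlex tool or API.

```python
MIN_RPO_SECONDS = 30        # lowest RPO quoted in this post
MAX_RPO_SECONDS = 60 * 60   # highest RPO quoted in this post (60 minutes)


def max_cycle_interval(rpo_seconds):
    """Longest gap between replication cycles that still fits two per RPO."""
    if not MIN_RPO_SECONDS <= rpo_seconds <= MAX_RPO_SECONDS:
        raise ValueError("RPO must be between 30 seconds and 60 minutes")
    return rpo_seconds / 2


def is_rpo_compliant(seconds_behind_source, rpo_seconds):
    """True when the target copy lags the source by no more than the RPO."""
    return seconds_behind_source <= rpo_seconds


if __name__ == "__main__":
    for rpo in (30, 300, 3600):
        print(f"RPO {rpo}s: replicate at least every {max_cycle_interval(rpo)}s")
```

With a 30 second RPO, this reading implies a replication cycle at least every 15 seconds, which leaves headroom for a cycle that runs long without breaching the objective.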
The following figures depict the replication scenario under a steady-state workload:
Figure 4. 100% RPO compliance for RPO of 30s for an Oracle database during a steady application workload
Figure 5. Replication dashboard view of PowerFlex
In the case of disaster recovery, the entire application can be up and running after failing over to the secondary system, with less than 30 seconds of data loss.
When we do a planned switchover or failover, the volumes on the secondary system are automatically changed to read-write access mode and the volumes on the source are changed to read-only. Consequently, we can bring up the Oracle database on the secondary system by setting up the Oracle environment variables and starting the database.
Once the RCG is in failover or switchover mode, the user can decide how to continue with replication:
PowerFlex also provides various other options:
PowerFlex native volume replication is a unique solution that gives customers easy configuration and setup, so they do not have to worry about disaster.
Irrespective of workload and application, it is designed to support massive scale while providing RPOs as low as 30 seconds.
For more information, please visit: DellTechnologies.com/PowerFlex.
Grace Under Pressure — PowerFlex Rebuild Superpowers
Tue, 04 Aug 2020 20:11:02 -0000|
The first blog in this series, “Resiliency Explained — Understanding PowerFlex's Self-Healing, Self-Balancing Architecture,” covered an overview of how the PowerFlex system architecture provides superior performance and reliability. Today, we’ll take you through another level of detail with specific examples of recoverability.
Warning: Information covered in this blog may leave you wanting similar results from other vendors.
PowerFlex possesses some incredible superpowers that deliver the performance needed to run some of the world’s most demanding applications. But what happens when you experience an unexpected failure like losing a drive, a node, or even a rack of servers? Even planned outages for maintenance can result in vulnerabilities or degraded performance levels, IF you use conventional data protection architectures like RAID.
Just a reminder, PowerFlex is a high-performance software-defined storage system that delivers compute and storage in a unified fabric with the elasticity to scale compute, storage, or both to fit the workload. PowerFlex uses all-flash direct-attached media located on standard x86 servers, with industry-standard HBAs and 10 Gb/s or faster Ethernet NICs interconnecting the servers. The systems scale from 4 nodes to multi-rack configurations of 1000+ nodes, increasing capacity and linearly increasing IOPS, all while sustaining sub-millisecond latency.
PowerFlex takes care of dynamic data placement that ensures there are NO hot spots, so QoS is a fundamental design point and not an after-thought bolt-on “fix” for a poor data architecture scheme; there’s no data locality needed. PowerFlex handles the location of data to ensure there are no single points of failure, and it dynamically re-distributes blocks of data if you lose a drive, add a node, take a node off line, or have a server outage (planned or unplanned) containing a large number of drives. It automatically load balances the placement of data as storage use changes over time or with node expansion.
The patented software architecture underlying PowerFlex doesn’t use a conventional RAID protection mechanism. RAID serves a purpose, and even options like erasure coding have their place in data protection. What’s missing in these options? Let’s use a couple of analogies to compare traditional RAID and PowerFlex protection mechanisms:
Think of RAID as a multi-cup layout where each write must place data in multiple cups. If you lose a cup, you don’t necessarily re-arrange the data. You’re protected from data loss, but without the re-distribution, you’re still operating in a degraded state and potentially vulnerable to additional failures until the hardware replacement occurs. If you want to survive more than one cup failure, you need multiple writes to multiple cups, which creates more overhead (particularly in a software-defined storage system versus a hardware RAID controller-based system). It still only takes care of data protection and not necessarily performance recovery.
Think of the PowerFlex architectural layout of data as a three-dimensional checkerboard where the data placement keeps your data safe. In the checkerboard layout, we can quickly re-arrange the checkers if you lose a box on the board, a row or column, or even a complete board of checkers. The data is re-arranged to ensure there are always two copies of it, for ongoing protection and restoration of performance. The three-dimensional aspect comes from all nodes and all drives participating in the re-balancing process. The metadata management system seamlessly orchestrates re-distribution and balancing of data placement.
Whether the system has a planned or unplanned outage or a node upgrade or replacement, this automatic rebalancing happens rapidly because every drive in the pool participates. The more nodes and the more drives, the faster any data rebuild completes. In the software-defined PowerFlex solution, there’s no worrying about a RAID level or the performance trade-offs. It’s just taken care of for you seamlessly in the background, without any of the annoying complications RAID often introduces or the need for any specialized hardware controllers and their associated cost.
PowerFlex looks at the actual data stored on each drive rather than treating the whole drive capacity as what needs recovering. In this example, a drive failure occurs. The data levels illustrated here represent the total used capacity in these 6, 9, or 12 node configuration examples (we can scale to over 1,000 nodes). The 25%, 50%, and 75% levels show relative rebuild times for this 960GB SAS SSD to restore the data to a fully healthy (re-protected) state.
We’re showing you a rebuild scenario to emphasize the performance. Taking it to another level, though, you wouldn’t urgently need to replace the drive: we redistribute the data to other drives for protection and sustained performance, using virtual spare space provided by all of the drives to fill the gap. Unlike RAID, we don’t need to replace the drive to return the system to full health. You can replace the drive when it’s convenient.
Notice a few things:
This illustrates what happens when you have 35, 53, and 71 drives participating in the parallel rebuild process for the six, nine and twelve node configurations, respectively.
Node Rebuild (6 drives)
Here we show an example using a similar load level of data on the nodes. The nodes each contain six drives with a maximum of 5.76TB to be rebuilt. The entire cluster of drives participates in taking over the workloads, automatically rearranging the data placement and making sure the cluster always has two copies of the data residing on different nodes. Just as in the above drive rebuild example, the process leverages all the remaining drives in the cluster to take on the rebuild process and return to a fully protected state. That means there are 30 drives participating in the parallelized rebuild for the six-node configuration, 48 drives for the nine-node configuration, and 66 drives for the twelve-node configuration.
Notice again the near linear improvement in rebuild times as you increase the number of nodes and drives. As in the drive rebuild scenario, the node rebuild time observed also tends to approach a vanishing point for the varying data saturation levels.
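The drive counts quoted in the drive and node rebuild examples follow directly from the test setup of six 960GB drives per node. The short Python sketch below reproduces that arithmetic; the function names are invented for illustration, and the calculation says nothing about actual rebuild times, only about how many drives share the work and how much data is at stake.

```python
DRIVES_PER_NODE = 6        # drives populated per node in these tests
DRIVE_CAPACITY_GB = 960    # capacity of each SAS SSD


def drives_lost(failure):
    """Number of drives taken out by a 'drive' or 'node' failure."""
    if failure not in ("drive", "node"):
        raise ValueError("failure must be 'drive' or 'node'")
    return 1 if failure == "drive" else DRIVES_PER_NODE


def rebuild_participants(nodes, failure):
    """Surviving drives that share the parallel rebuild."""
    return nodes * DRIVES_PER_NODE - drives_lost(failure)


def data_to_rebuild_gb(failure, used_fraction):
    """Data to re-protect, given how full the failed drives were."""
    return drives_lost(failure) * DRIVE_CAPACITY_GB * used_fraction


if __name__ == "__main__":
    for nodes in (6, 9, 12):
        print(nodes, "nodes:",
              rebuild_participants(nodes, "drive"), "drives after a drive failure,",
              rebuild_participants(nodes, "node"), "drives after a node failure")
```

A full node holds 6 x 960GB = 5.76TB, which is the maximum quoted above, and each added node contributes six more drives to every rebuild.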
As mentioned previously, PowerFlex scales to 1000+ nodes. Take a scenario where an outage affects an entire rack of servers and you still need to remain operational and recoverable (unthinkable in conventional architectures), and you see why our largest customers depend on PowerFlex.
If the above tests were done just to show off the best rebuild times, we would just run these systems without any actual other work occurring. However, that wouldn’t reflect a real-world scenario where the intention is to continue operating gracefully and still recover to full operational levels.
These tests were done with the PowerFlex default rebuild setting of one concurrent I/O per drive. For customers with more aggressive needs to return to fully protected, PowerFlex can be configured to accelerate rebuilds as a priority. To optimize rebuilds even more than illustrated, you can set the number of concurrent I/Os per drive to two or more or even unlimited. Since changing the number of I/Os per drive does affect latency and IOPS, which could adversely impact workloads, we chose to illustrate our default example that intentionally balances keeping workload performance high while doing the rebuild.
Using FIO* as a storage I/O generator, we ran these rebuild scenarios with ~750k random IOPS of activity on the 12-node configuration, ~600k random IOPS on the 9-node configuration, and ~400k on the 6-node configuration, all while sustaining 0.5 ms latency levels (the cluster examples here can drive well over 1M IOPS at sub-millisecond levels). This represents a moderately heavy workload operating while we performed these tests. Even with the I/O generator running and the rebuild process taking place, the CPU load was approximately 20%. The I/O generator alone consumed only 8 to 10% of the available CPU capacity. Both CPU utilization figures underscore the inherent software-defined infrastructure efficiency of PowerFlex, which leaves a lot of available capacity to host application workloads. In this test case scenario, both the compute and storage occupied the same node (hyperconverged), but remember that we can also run in a 2-layer configuration using compute-only and storage-only nodes for asymmetrical scaling.
The systems used for these tests had the following configuration. Note that we used six drives per node in the R740xd chassis that can hold 24 drives, which means there were another 18 slots available for additional drives. As noted previously, more drives mean more parallel capabilities for performance and rebuild velocity.
PowerFlex delivers cloud-scale performance with unrivaled grace-under-pressure reliability, as a software-defined block storage product with six nines of availability. Be sure to read Part 1 of this blog, “Resiliency Explained — Understanding PowerFlex's Self-Healing, Self-Balancing Architecture,” to see the other protection architecture elements not covered here. For more information on our validated mission-critical workloads like Oracle RAC, SAP HANA, MySQL, MongoDB, SAS, Elastic, VDI, Cassandra, and other business-differentiating applications, please visit our PowerFlex product site.
* FIO set for 8k, 20% random write, 80% random reads
Resiliency Explained — Understanding the PowerFlex Self-Healing, Self-Balancing Architecture
Wed, 15 Jul 2020 16:35:08 -0000|
My phone rang. When I picked up it was Rob*, one of my favourite PowerFlex customers who runs his company’s Storage Infrastructure. Last year, his CTO made the decision to embrace digital transformation across the entire company, which included a software-defined approach. During that process, they selected the Dell EMC PowerFlex family as their Software-Defined Storage (SDS) infrastructure because they had a mixture of virtualised and bare-metal workloads, needed a solution that could handle their unpredictable storage growth, and also one powerful enough to support their key business applications.
During testing of the PowerFlex system, I educated Rob on how we give our customers an almost endless list of significant benefits – blazingly fast block storage performance that scales linearly as new nodes are added to the system; a self-healing & self-balancing storage platform that automatically ensures that it always gives the best possible performance; super-fast rebuilds in the event of disk or node failures, plus the ability to engineer a system that will meet or exceed his business commitments to uptime & SLAs.
PowerFlex provides all this (and more) thanks to its “Secret Sauce” – its Distributed Mesh-Mirror Architecture. It ensures there are always two copies of your application data – thus ensuring availability in case of any hardware failure. Data is intelligently distributed across all the disk devices in each of the nodes within a storage pool. As more nodes are added, the overall performance increases nearly linearly, without affecting application latencies. Yet at the same time, adding more disks or nodes also makes rebuild times during those (admittedly rare) failure situations decrease – which means that PowerFlex heals itself more quickly as the system grows!
PowerFlex automatically ensures that the two copies of each block of data that gets written to the Storage Pool reside on different SDS (storage) nodes, because we need to be able to get a hold of the second copy of data if a disk or a storage node that holds the first block fails at any time. And because the data is written across all the disks in all the nodes within a Storage Pool, this allows for super-quick IO response times, because we access all data in parallel.
Data also gets written to disk using very small chunk sizes – either 1MB or 4KB, depending on the Storage Pool type. Why is this? Doing this ensures that we always spread the data evenly across all the disk devices, automatically preventing performance hot-spots from ever being an issue in the first place. So, when a volume is assigned to a host or a VM, that data is already spread efficiently across all the disks in all Storage Nodes. For example, a 4-Node PowerFlex system, with 3 volumes provisioned from it, will look something like the following:
Figure 1: A Simplified View of a 4-Node PowerFlex System Presenting 3 Storage Volumes
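The placement idea described above can be sketched in a few lines. This is a toy model, not the PowerFlex placement algorithm: it assumes each chunk simply picks two distinct nodes at random, and the node names and chunk count are made up for illustration. It shows the two properties the text relies on: the two copies of a chunk never share a node, and the copies end up spread roughly evenly.

```python
import random
from collections import Counter


def place_chunks(num_chunks, nodes, seed=0):
    """Give every chunk a primary and a mirror copy on two different nodes."""
    rng = random.Random(seed)
    placement = []
    for _ in range(num_chunks):
        # sample() returns two distinct nodes, so the copies never collide
        primary, mirror = rng.sample(nodes, 2)
        placement.append((primary, mirror))
    return placement


nodes = ["node1", "node2", "node3", "node4"]  # hypothetical 4-node system
placement = place_chunks(10_000, nodes)

# Count how many chunk copies (primary or mirror) land on each node.
copies_per_node = Counter(node for pair in placement for node in pair)
for node in nodes:
    print(node, copies_per_node[node])
```

With 10,000 chunks and two copies each, every node ends up holding close to a quarter of the 20,000 copies.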
Now, here is where the magic begins. In the event of a drive failure, the PowerFlex rebuild process utilizes an efficient many-to-many scheme for very fast rebuilds. It uses ALL the devices in the storage pool for rebuild operations and will always rebalance the data in the pool automatically whenever new disks or nodes are added to the Storage Pool. This means that, as the system grows, performance increases linearly – which is great for future-proofing your infrastructure if you are not sure how your system will grow. But this also gives another benefit – as your system grows in size, rebuilds get faster!
Customers like Rob typically raise their eyebrows at that last statement – until we provide a simple example to get the point across – and then they have a lightbulb moment. Think about what would happen if we used a 4-node PowerFlex system, but only had one disk drive in each storage node. All data would be spread evenly across the 4 nodes, but we also have some spare capacity reserved, which is also spread evenly across each drive. This spare capacity is needed to rebuild data into in the event of a disk or node failure, and it usually equates to either the capacity of an entire node or 10% of the entire system, whichever is larger. At a superficial level, a 4-Node system would look something like this:
Figure 2: A Simplified View of a 4-Node PowerFlex System & Available Dataflows
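The spare capacity rule of thumb above (one node's worth or 10% of the system, whichever is larger) can be written as a small calculation. This is a minimal sketch of that rule only. The node sizes are hypothetical, and a real system's spare policy is configured by the administrator rather than computed this way.

```python
def spare_capacity_tb(node_capacities_tb, min_fraction=0.10):
    """Return the larger of: the biggest node's capacity, or 10% of the total."""
    total = sum(node_capacities_tb)
    largest_node = max(node_capacities_tb)
    return max(largest_node, min_fraction * total)


# Hypothetical examples with 10 TB of raw capacity per node.
small_system = [10] * 4    # 4 nodes: one node (10 TB) exceeds 10% (4 TB)
large_system = [10] * 20   # 20 nodes: 10% (about 20 TB) exceeds one node

print(spare_capacity_tb(small_system))
print(spare_capacity_tb(large_system))
```

On small systems the "one whole node" term dominates, which is why a 4-node system reserves a quarter of its capacity as spare, while on larger systems the 10% term takes over.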
If one of those drives (or nodes) failed, then obviously we would end up rebuilding between the three remaining disks, one disk per node:
Figure 3: Our Simplified 4-Node PowerFlex System & Available Dataflows with One Failed Disk
Now of course, in this scenario, that rebuild is going to take some time to complete. We will be performing lots of 1MB or 4KB copies between the three remaining nodes, in both directions, as we rebuild into the spare capacity available on the remaining nodes & get back to having two copies of data in order to be fully protected again. It is worth pointing out here that a node typically contains 10 or 24 drives, not just one, so PowerFlex isn’t just protecting you from “a” drive failure, we’re able to protect you from a whole pile of drives. This is not your typical RAID card schema.
Now – let the magic of PowerFlex begin! What happens if we add a fifth storage node into the mix? And what happens when a disk or node fails in this scenario?
Figure 4: Dataflows in a Normally Running 5-Node PowerFlex System … & Available Dataflows with One Failed Disk or Node
It should be clear that we now have more disks, and more nodes, participating in the rebuild process, so the rebuild completes substantially faster than in our previous 4-node scenario. But PowerFlex nodes do not have just a single disk inside them. They typically have 10 or 24 drive slots, so even a small deployment of 4 nodes with 10 disks each will have data placed intelligently and evenly across all 40 drives, configured as one Storage Pool. With today’s flash media, that is a heck of a lot of performance capability at your fingertips, delivered with consistent sub-millisecond latencies.
Let me also highlight the “many-to-many” rebuild scheme used by each Storage Pool. This means that any data within a Storage Pool can be rebuilt to all the other disks in the same Pool. If we have 40 drives in our pool, it means that when one drive fails, the other 39 drives will be utilised to rebuild the data of the failed drive. This results in extremely quick rebuilds that occur in parallel, with minimum impact to application performance if we lose a disk:
Figure 5: A 40-disk Storage Pool, with a Disk Failure… Showing The Magic of Parallel Rebuilds
Note that we had to over-simplify the dataflows between the disks in the figure above, because if we tried to show all the interconnects at play, we would simply have a tangle of green arrows!
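To put rough numbers on the many-to-many idea, the sketch below uses a deliberately simple model: every surviving drive in the pool contributes the same fixed rebuild bandwidth, and nothing else (network, throttling, application load) limits the rebuild. The 1,000 GB of data and 100 MB/s per drive are hypothetical figures chosen for illustration, not measured PowerFlex values.

```python
def estimated_rebuild_seconds(data_on_failed_drive_gb, drives_in_pool,
                              per_drive_rebuild_mb_s):
    """Estimate rebuild time when all surviving drives share the work."""
    surviving = drives_in_pool - 1
    if surviving < 1:
        raise ValueError("need at least one surviving drive")
    aggregate_mb_s = surviving * per_drive_rebuild_mb_s
    return data_on_failed_drive_gb * 1000 / aggregate_mb_s


# Same failed drive, growing pool sizes (all figures hypothetical).
for drives in (4, 5, 40):
    seconds = estimated_rebuild_seconds(1000, drives, 100)
    print(f"{drives:>2} drives: about {seconds / 60:.1f} minutes")
```

Under these assumptions, going from 4 drives to 40 raises the number of rebuild participants from 3 to 39, so the estimated rebuild time drops by a factor of 13.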
Here’s another example to explain the difference between PowerFlex and conventional RAID-type drive protection. An initial rebuild test on an empty system usually completes in little more than a minute. This is because PowerFlex only ever rebuilds chunks of application data, unlike a traditional RAID controller, which rebuilds disk blocks whether they contain data or not. Why waste resources rebuilding empty blocks of zeroes when you need to recover from a failed disk or node as quickly as possible?
The PowerFlex Distributed Mesh-Mirror architecture is truly unique and gives our customers the fastest, most scalable and most resilient block storage platform available on the market today! Please visit www.DellTechnologies.com/PowerFlex for more information.
* Name changed to protect the innocent!
PowerFlex and CloudLink: A Powerful Data Security Combination
Wed, 08 Jul 2020 14:06:22 -0000|
Security and operational efficiency continue to top IT executives’ datacenter needs lists. Dell Technologies looks at the complete solution to achieve both so customers can focus on their business outcomes.
Dell Technologies’ PowerFlex is a software-defined storage platform designed to significantly reduce operational and infrastructure complexity, empowering organizations to move faster by delivering flexibility, elasticity, and simplicity with predictable performance and resiliency at scale. PowerFlex provides a unified fabric of compute and storage, with the scale-out flexibility to grow either ingredient to match workload requirements, and with full lifecycle simplification provided by PowerFlex Manager. Dell Technologies’ CloudLink, a data encryption and key management solution, supports workload deployments from edge to core to cloud. It is a perfect complement to the PowerFlex family, enabling flexible encryption tailored to the modern datacenter’s needs.
With increasing regulatory and compliance requirements, more and more customers now realize how critical encryption is to securing their data centers and need solutions that are built into their platforms. CloudLink, integrated with PowerFlex, provides reliable data encryption and key management in one solution with the flexibility to satisfy most customers' needs.
CloudLink’s rich feature set integrates directly into the PowerFlex platform allowing our customers access to CloudLink's encryption and key management functionality, including data at rest and data in motion encryption, full key lifecycle management, and lightweight multi-tenancy support.
CloudLink provides software-based data encryption and a full set of key management capabilities for PowerFlex, including:
Self-encrypting drives (SEDs) offer high-performance, hardware-based Data-at-Rest Encryption, ensuring that all data in the deployment is safe from prying eyes. On a PowerFlex platform, CloudLink can manage the keys for each individual drive and store them safely within our encrypted vault, where customers can leverage CloudLink's full key lifecycle management feature set. This option, also integrated and deployable with PowerFlex Manager, is ideal for sensitive data assets that require high performance.
Sometimes Data-at-Rest Encryption is not enough, and our customers need to encrypt their virtual machines. CloudLink provides VM encryption by deploying agents on the guest OS. CloudLink's agent encryption gives our customers the ability to move encrypted VMs throughout their environment making tasks such as replication, deployment to production from QA, or out to satellite offices, safer and easier.
CloudLink’s encryption for machines agent can also encrypt data volumes on bare metal servers allowing customers to keep their data safe even when deployed on legacy hardware.
When third-party encryptors need external key management, they turn to solutions that implement KMIP (Key Management Interoperability Protocol). This open standard defines how encryptors and key managers communicate. CloudLink implements KMIP both as a client and as a server, to provide basic key storage and management for encryptors such as VMware’s native encryption features, or to plug in to a customer’s existing keystore. These capabilities provide the flexibility required for today’s heterogeneous environments.
There is a sea change occurring in data centers brought on by the relatively new technology of containers. 451 Research, a global research and advisory firm, released the results of its 2020 Voice of the Enterprise survey, which indicates that as companies consider the move to containerized deployments, security and compliance concerns are top of mind. However, in many of the new container technology products available, proper security is not built in.
Given the extreme mobility of containers, keeping customers’ data safe as applications move throughout a deployment – especially within the cloud – is a challenge. To address this gap, we introduced file volume encryption for Kubernetes container deployments in our CloudLink 7.0 release, which has been validated with PowerFlex 3.5. Our container encryption functionality is built on the same full lifecycle key management and agent-based encryption architectural model that we currently offer for PowerFlex. We deploy an agent within the container such that it sits directly on the data path. As the data is saved, we intercept it and make sure it is encrypted as it travels to and then comes to rest in the data store.
Hand in hand with PowerFlex, CloudLink provides data encryption and key management with unmatched flexibility, superior reliability, and simple and efficient operations, all supported by Dell as a complete solution. PowerFlex Manager is a comprehensive IT operations and lifecycle management tool that drastically simplifies management and ongoing operation. CloudLink is integrated into this tool to make the deployment of the CloudLink agent a natural part of the PowerFlex management framework.
PowerFlex: The advantages of disaggregated infrastructure deployments
Mon, 29 Jun 2020 18:57:26 -0000|
For several years, there has been a big push from quite a number of IT vendors towards delivering solutions based on Hyperconverged Infrastructure or HCI. The general concept of HCI is to take the three primary components of IT, compute, network and storage, and deliver them in a software defined format within a building block, normally an x86 based server. These building blocks are then joined together to create a larger, more resilient environment. The software defined components are typically a hypervisor to provide compute, virtual adapters and switches for networking, along with some software that takes the local disks attached to the server, combines them with the disks directly attached to the other building blocks and presents them as a virtual storage system back to the environment.
The HCI approach is attractive to customers for a variety of reasons:
There are of course scenarios where the HCI model does not fit. The limitations are frequently associated with the software-defined storage part of the environment, in situations such as the following:
Several HCI vendors have attempted to address these points but often their solutions to the issues involve a compromise.
What if there was a solution that provided software defined storage that was flexible enough to meet these requirements without compromise?
Step forward PowerFlex, a product flexible enough to be deployed as an HCI architecture, a disaggregated architecture (separate compute and storage layers managed within the same fabric), or a mixture of the two.
So how can PowerFlex be this flexible?
It is all about how the product was initially designed and developed. It consists predominantly of three separate software components:
Each of these components can be installed across a cluster of servers in a variety of ways in order to create flexible deployment scenarios. The SDC and SDS components communicate with one another over a standard TCP/IP network to form an intelligent fabric. This is all overseen by the MDM, which is not in the data path.
Some pictures will help illustrate this far better than I can with words.
By installing the SDC (the C in a yellow box) and the SDS (the S in a green box) on to the same server, an HCI environment is created.
If the SDC and SDS are installed on dedicated servers, a disaggregated infrastructure is created.
And because PowerFlex is entirely flexible (the clue is in the name), HCI and disaggregated architectures can be mixed within the same environment.
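The three deployment styles described above follow directly from which components sit on which server. The sketch below captures that rule in a small classifier. It is purely illustrative: the `Server` type, the server names, and the `classify` function are hypothetical and have nothing to do with any PowerFlex API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    has_sdc: bool  # consumes storage (the "C" box)
    has_sds: bool  # contributes storage (the "S" box)


def classify(servers):
    """Label a set of servers as HCI, disaggregated, or mixed."""
    roles = set()
    for server in servers:
        if server.has_sdc and server.has_sds:
            roles.add("hci")
        elif server.has_sdc:
            roles.add("compute-only")
        elif server.has_sds:
            roles.add("storage-only")
    if not roles:
        raise ValueError("no server runs an SDC or an SDS")
    if roles == {"hci"}:
        return "HCI"
    if "hci" not in roles:
        return "disaggregated"
    return "mixed"


hci = [Server("n1", True, True), Server("n2", True, True)]
two_layer = [Server("c1", True, False), Server("s1", False, True)]

print(classify(hci))              # HCI
print(classify(two_layer))        # disaggregated
print(classify(hci + two_layer))  # mixed
```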
What are the advantages of deploying a disaggregated environment?
Whilst HCI deployments are ideal for environments where compute requirements and storage capacity increase in lockstep, there are many use cases where compute and storage needs grow independently. PowerFlex is capable of serving both requirements.
PowerFlex was built to allow this disaggregation of resources from day one, which means that there is no downside to performance or capacity when storage nodes are added to existing clusters. In fact, there are only positives: increased performance, capacity and resilience, which sets PowerFlex apart from many other software-defined storage products.
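A quick sizing exercise shows why independent growth matters. The sketch below is a simplification under stated assumptions: nodes come in one fixed size, every HCI node carries both compute and storage, and the workload figures (128 cores, 400 TB) and node sizes (32 cores, 20 TB) are invented for illustration.

```python
import math


def hci_node_count(cores_needed, tb_needed, cores_per_node, tb_per_node):
    """Every HCI node adds both resources, so the larger demand sets the count."""
    return max(math.ceil(cores_needed / cores_per_node),
               math.ceil(tb_needed / tb_per_node))


def disaggregated_node_counts(cores_needed, tb_needed,
                              cores_per_node, tb_per_node):
    """Compute and storage nodes are sized separately."""
    return (math.ceil(cores_needed / cores_per_node),
            math.ceil(tb_needed / tb_per_node))


# Hypothetical storage-heavy workload.
hci_nodes = hci_node_count(128, 400, 32, 20)
compute_nodes, storage_nodes = disaggregated_node_counts(128, 400, 32, 20)

print("HCI nodes:", hci_nodes)
print("Application cores deployed in HCI:", hci_nodes * 32)
print("Compute nodes:", compute_nodes, "Storage nodes:", storage_nodes)
print("Application cores deployed when disaggregated:", compute_nodes * 32)
```

In this example the HCI design needs 20 nodes to reach the capacity target and so deploys 640 application cores for a 128-core workload, while the disaggregated design pairs 4 compute nodes with 20 storage nodes.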
Dell EMC PowerFlex and VMware Cloud Foundation for High Performance Applications
Thu, 25 Jun 2020 13:10:33 -0000|
The world in 2020 has shown all industries that innovation is necessary to thrive in all conditions. VMware Cloud Foundation (VCF) hybrid cloud platform was crafted by innovators who realize the biggest asset our customers have is their information technology and the data that runs the business. The VCF offering takes the complexity out of operationalizing infrastructure to enable greater elasticity, growth, and simplification through improved automation. VCF enables options available using on-premises and multi-cloud deployments to address ever changing enterprise needs.
VMware included design factors that anticipated customers’ use of varying storage options in the flexibility of implementing VCF. VMware vSAN is the standard for VCF hyperconverged infrastructure (HCI) deployments and is directly integrated into vSphere and VCF. For those circumstances where workloads or customer resource usage require alternative storage methods, VMware built flexibility into the VCF storage offering. Just as we see a wide variety in desktop computing devices, one size does not fit all in enterprise storage products either. Dell Technologies’ PowerFlex (formerly VxFlex) provides a software-defined mechanism to add a combination of compute and storage with scale-out flexibility. As customers look to software-defined operational constructs for agility, PowerFlex provides an adjustable means to add the right balance of storage resources while enabling non-disruptive additions without painful migrations as demands increase.
Joining the Dell Technologies Cloud family as a validated design, Dell EMC PowerFlex helps customers simplify their path to hybrid cloud by combining the power of Dell EMC infrastructure with VMware Cloud Foundation software as supplemental storage. As a high-performance, scale out, software-defined block storage product, PowerFlex provides a combination of storage and compute in a unified fabric that's well equipped to service particularly challenging workloads. The scalability of compute and/or storage in a modular architecture provides an asymmetrical (2-layer) option to add capacity to either compute or storage independently. PowerFlex makes it possible to transform from a traditional three-tier architecture to a modern data center without any trade-offs between performance, resilience or future expansion.
PowerFlex significantly reduces operational and infrastructure complexity, empowering organizations to move faster by delivering flexibility, elasticity, and simplicity with predictable performance and resiliency at scale for deployments. PowerFlex Manager is a key element of our engineered systems, providing a full lifecycle administration experience for PowerFlex from day 0 through expansions and upgrades. It is independent of, but complementary to, the full-stack lifecycle management available through VCF via SDDC Manager. A cornerstone value proposition of VCF is administering the lifecycle management of OS upgrades, vSphere updates, vRealize monitoring, automation and NSX administration. PowerFlex Manager works in parallel with VCF to deliver a comprehensive lifecycle experience for the physical ingredients and for the PowerFlex software-defined storage layer. PowerFlex also offers a vRealize Operations plug-in for unified monitoring from the VMware vRealize Suite, which is included in most VCF editions. From a storage management perspective, PowerFlex utilizes a management system that complements VCF and VMware vSphere by working within the appropriate vCenter management constructs. PowerFlex Manager provides the administration of PowerFlex storage functions, while VCF and vCenter manage the allocation of LUNs to provisioned VMFS file systems to provide datastores for the provisioned workloads.
PowerFlex systems enable customers to scale from a small environment to enterprise scale with over a thousand nodes. In addition, they provide enterprise-grade data protection, multi-tenant capabilities, and add-on enterprise features such as QoS, thin provisioning, compression and snapshots. PowerFlex systems deliver the performance and time-to-value required to meet the demands of the modern enterprise data center.
Does Supplemental Storage Mean Slow or Light Workload Use Cases?
PowerFlex is available as a Dell Technologies validated design for supplemental storage in VCF, allowing customers to realize the value of PowerFlex within the VCF environment. By providing sub-millisecond latency, high IOPS and high throughput that scale linearly as nodes join the fabric, it offers a very predictable scaling profile that accelerates the VCF vision within the datacenter.
PowerFlex, as a part of VCF, can help solve for even the most demanding of applications. Using the supplemental capabilities to service workloads with the highest of efficiency provides a best of class performance experience. Some illustrative examples of demanding application workloads validated with PowerFlex, independent of VCF, include the following:
SAP HANA certified for PowerFlex integrated rack in both 4-socket and 2-socket offerings (certification details). Highly efficient in hosting up to six production HANA instances per 4-socket server. Our capabilities outperform external competitors by hosting 2x the capacity. The Configuration and Deployment Best Practices for SAP HANA white paper provides details. While this white paper illustrates a single layer architecture, even better performance characteristics are achievable using the VCF aligned 2-layer architectural implementation of PowerFlex.
Oracle RAC & Microsoft SQL
Flexibility to run compute and storage on separate hardware results in significant reduction of database licensing cost.
Validated/certified by SAS for running SAS mixed analytics workloads (white paper), providing an average throughput of 210 MB/sec per core (40% greater than the 150 MB/sec that SAS recommends for certification).
The validated solution (white paper) with Elastic provides customers with the required high-performance, scalable, block-based IO with flexible deployment options in multiple operating environments (Windows, Linux, Virtualized/Bare Metal). Elastic validated the efficiency of PowerFlex using only three compute and four storage nodes to deliver ~1 billion indexing events measured by Elastic’s Rally benchmarking tool.
The validated PowerFlex solution for Epic delivers six nines of availability and high performance for the critical Epic Hyperspace workloads while simultaneously hosting VDI alongside the operational and analytical databases, for a completely integrated infrastructure option.
For customers deploying Kubernetes container-based database deployments like Cassandra, PowerFlex provides 300,000 operations/second for 10 million operations (Read intensive operations) with avg read latency of 1ms on just eight nodes.
PowerFlex gives Dell Technologies the ability to help customers address diverse infrastructure needs. For more information on all of the Dell Technologies storage options with Cloud Validated Designs for VMware Cloud Foundation, please view our white paper. The implementation guide for using PowerFlex for supplemental storage provides the simple steps to provide complementary storage options for VCF deployments. For more information on the PowerFlex product family and workload solutions, please see the product page here. The PowerFlex White Paper - Technical Overview also provides a comprehensive perspective on how organizations can begin changing the way they think about a modern data center architecture. Please contact your local Dell sales representative for more information.
Other pre-tested Dell Technologies Storage products validated for VMware Cloud Foundation that provide the capabilities to independently scale storage and compute include the offerings below. You can find more details in the Dell Technologies Cloud Validated Designs document.