The latest news about PowerFlex releases and updates
Introducing the PowerFlex Management Pack for vRealize Operations
Mon, 02 Nov 2020 13:09:42 -0000
Achieving operational efficiency in today’s modern cloud infrastructure brings automation to the forefront. Centralized visibility provides a key piece of the insight needed to spot operational inefficiencies and take action before they disrupt the business.
We are pleased to share the general availability of Dell EMC PowerFlex Management Pack for vRealize Operations 8.x. The PowerFlex MP for vROps extends the visibility of PowerFlex systems into vROps where IT can monitor their complete data center and cloud operations. It is available to all PowerFlex rack and appliance customers at no additional cost. This brings additional value to the comprehensive IT operations management functionality delivered by PowerFlex Manager that enables full life cycle management of the unified compute and software defined storage solution.
The management pack uses APIs to query and collect key PowerFlex metrics for storage, compute, networking, and server hardware, and ingests them into vROps, where they can be visualized using the out-of-the-box dashboards. It also provides a detailed system-level view that shows the health status of, and the relationships between, the different components of the PowerFlex system.
Dashboards: The management pack includes 13 default dashboards showing details of PowerFlex storage, PowerFlex Manager, PowerFlex nodes, network switches, ESXi hosts, and clusters. These configurable dashboards provide user customizable data displays that adjust to meet a wide variety of requirements.
Predefined symptoms and alert definitions: The management pack includes 166 symptom definitions and 152 alert definitions based on engineering best practices for the PowerFlex systems. Symptoms and alerts can be customized by the user to meet the demand of their environment.
Historical data: This is available for all PowerFlex Adapter resource kinds. This data provides a view of consumption over time and includes capacity forecasting based on usage for PowerFlex storage.
Network topology and relationship: The topology tree functionality available in vROps is extremely useful when mapping relationships between nodes, network interfaces, switch port, VLAN, port-channel, and vPC.
Detailed metric collection: In addition to the default dashboards, users have the option of drilling into specific metrics for nearly all available data from the components of PowerFlex system, even if it is not included in a dashboard.
Multiple PowerFlex systems awareness: Ability to group and differentiate multiple PowerFlex systems.
PowerFlex node type differentiation: Ability to identify and classify compute, storage, hyperconverged, and management controller nodes.
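The capacity forecasting mentioned under "Historical data" can be pictured as fitting a trend to past usage and extrapolating it. The source does not describe the forecasting method vROps actually uses, so the following is only a minimal sketch that assumes a straight-line least-squares fit over daily samples; the function names and the sample figures are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def linear_fit(samples: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept for samples taken at t = 0, 1, 2, ..."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_t = (n - 1) / 2
    mean_y = sum(samples) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(samples))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    return slope, mean_y - slope * mean_t


def days_until_full(used_tb: Sequence[float], capacity_tb: float) -> Optional[float]:
    """Days until the fitted trend reaches capacity, or None if usage is not growing."""
    slope, intercept = linear_fit(used_tb)
    if slope <= 0:
        return None
    fitted_now = intercept + slope * (len(used_tb) - 1)
    return max(0.0, (capacity_tb - fitted_now) / slope)


# Hypothetical daily usage samples (TB) for one storage pool.
print(days_until_full([10.0, 12.0, 14.0, 16.0], capacity_tb=20.0))
```

A straight line is the simplest possible model; a production forecaster would typically also account for seasonality and confidence bounds.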
PowerFlex Details: This dashboard shows all the PowerFlex storage KPIs with historical data providing a view of storage performance utilization over time.
PowerFlex Node Summary: This dashboard lets you monitor the health status of all your PowerFlex nodes and their hardware components.
PowerFlex Networking Performance: This dashboard shows network KPIs such as throughput, errors, and packet discards, with historical data providing a view of network utilization over time.
For customers who have already invested in vRealize Operations, this management pack adds significant value for monitoring their PowerFlex systems. It is an end-to-end monitoring and alerting solution for PowerFlex infrastructure using vROps. It helps customers with capacity planning based on historical resource consumption. It also helps identify usage trends and provides the insight needed to spot operational issues and inefficiencies, so that customers can take action to avoid service outages and mitigate business disruption. This integration with VMware vRealize Operations reduces operational complexity by using a unified platform to monitor and manage private data center infrastructure, as well as hybrid and multi-cloud environments.
Demystifying CSI plug-in for PowerFlex (persistent volumes) with Red Hat OpenShift
Wed, 14 Oct 2020 18:12:01 -0000
The Container Storage Interface (CSI) is a standard for exposing file and block storage to containerized workloads on container orchestrators such as Kubernetes and OpenShift. CSI lets third-party storage providers (for example, PowerFlex) write plug-ins that allow OpenShift to consume storage from their backends as persistent storage.
CSI driver for Dell EMC VxFlex OS can be installed using Dell EMC Storage CSI Operator. It is a community operator and can be deployed using OperatorHub.io.
Master node components do not communicate directly with the CSI driver; they interact only with the API server on the master nodes. The sidecar containers deployed with the CSI driver therefore MUST watch the Kubernetes API and trigger the appropriate CSI operations against the driver. The kubelet discovers CSI drivers using the kubelet plug-in registration mechanism and issues calls to the CSI driver directly.
External Provisioner – The CSI external provisioner is a sidecar container that watches the k8s API server for PersistentVolumeClaim objects. It calls CreateVolume against the specified CSI endpoint to provision a volume.
External Attacher – The CSI external attacher is a sidecar container that watches the API server for VolumeAttachment objects and triggers controller [Publish|Unpublish] volume operations against a CSI endpoint.
CSI Controller plug-in – The controller component can be deployed as a Deployment or StatefulSet on any node in the cluster. It consists of the CSI driver that implements the CSI Controller service.
CSI Identity – It enables k8s components and CSI containers to identify the driver.
CSI Node Plugin – The node component should be deployed on every node in the cluster through a DaemonSet. It consists of the CSI driver that implements the CSI Node service and the node driver registrar sidecar container.
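The division of labor described above, where a sidecar watches the API server and calls the driver, can be illustrated with a small in-memory simulation. This is not the real external provisioner and makes no network or gRPC calls; `FakeCsiController`, `reconcile`, and the claim fields are hypothetical names used only to show the control flow.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PersistentVolumeClaim:
    name: str
    storage_class: str
    size_gib: int
    bound_volume: str = ""  # empty until a volume has been provisioned


class FakeCsiController:
    """Stand-in for a CSI driver's Controller service (no real gRPC involved)."""

    def __init__(self) -> None:
        self.volumes: Dict[str, int] = {}

    def create_volume(self, name: str, size_gib: int) -> str:
        # CreateVolume is idempotent: asking again for the same name
        # returns the existing volume instead of creating a second one.
        self.volumes.setdefault(name, size_gib)
        return name


def reconcile(claims: List[PersistentVolumeClaim],
              driver: FakeCsiController,
              handled_class: str) -> List[str]:
    """One pass of a provisioner-style loop over the claims it has observed."""
    provisioned = []
    for pvc in claims:
        if pvc.storage_class != handled_class or pvc.bound_volume:
            continue  # not ours, or already bound
        pvc.bound_volume = driver.create_volume(f"pvc-{pvc.name}", pvc.size_gib)
        provisioned.append(pvc.bound_volume)
    return provisioned


claims = [
    PersistentVolumeClaim("db-data", "powerflex-xfs", 16),
    PersistentVolumeClaim("logs", "some-other-class", 8),
]
driver = FakeCsiController()
print(reconcile(claims, driver, handled_class="powerflex-xfs"))
```

Running `reconcile` a second time provisions nothing, which mirrors how a watch-based controller converges on the desired state rather than acting on every event.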
Storage within OpenShift Container Platform 4.x is managed from worker nodes. Kubernetes exposes this storage through two API resources: PersistentVolume (PV) and PersistentVolumeClaim (PVC) objects.
Persistent Volumes – Kubernetes provides physical storage devices to the cluster in the form of objects called Persistent Volumes.
Persistent Volume Claim – This object lets pods use storage from Persistent Volumes.
Storage Class – This object helps you create PV/PVC pair for pods. It stores information about creating a persistent volume.
- name: powerflexos
- key: csi-vxflexos.dellemc.com/X_CSI_VXFLEXOS_SYSTEMNAME
- name: powerflex-xfs
- key: csi-vxflexos.dellemc.com/X_CSI_VXFLEXOS_SYSTEMNAME
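The `name` and `key` lines above appear to be fragments of storage class definitions. As a rough illustration of what a complete StorageClass for this driver might contain, the sketch below builds the manifest as a Python dictionary. The provisioner name is taken from the key prefix in the fragments; the `storagepool` parameter, the pool name, the topology value, and the ext4 file system for `powerflexos` are assumptions, not values confirmed by the source.

```python
import json

# Taken from the prefix of the topology key shown in the fragments above.
PROVISIONER = "csi-vxflexos.dellemc.com"
TOPOLOGY_KEY = "csi-vxflexos.dellemc.com/X_CSI_VXFLEXOS_SYSTEMNAME"


def storage_class(name: str, fs_type: str, storage_pool: str,
                  reclaim_policy: str = "Delete") -> dict:
    """Build a StorageClass manifest; parameter names are illustrative."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": PROVISIONER,
        "reclaimPolicy": reclaim_policy,
        "parameters": {
            "storagepool": storage_pool,            # assumed parameter name
            "csi.storage.k8s.io/fstype": fs_type,   # ext4 or xfs
        },
        "allowedTopologies": [{
            "matchLabelExpressions": [
                # The value below is an assumption for illustration.
                {"key": TOPOLOGY_KEY, "values": [PROVISIONER]},
            ],
        }],
    }


# "pool1" is a hypothetical PowerFlex storage pool name.
for sc in (storage_class("powerflexos", "ext4", "pool1"),
           storage_class("powerflex-xfs", "xfs", "pool1")):
    print(json.dumps(sc, indent=2))
```

Because JSON is accepted wherever Kubernetes accepts YAML, the printed output could be piped to `oc apply -f -` once the assumed values are replaced with real ones.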
Static Provisioning – This allows you to manually make existing PowerFlex storage available to the cluster.
Dynamic Provisioning - Storage volumes can be created on-demand. Storage resources are dynamically provisioned using the provisioner that is specified by the StorageClass object.
Retain Reclaiming – Once the PersistentVolumeClaim is deleted, the corresponding PersistentVolume is not deleted; instead it moves to the Released state, and its data can be manually recovered.
Delete Reclaiming – This is the default reclaim policy. Unlike the Retain policy, it deletes the persistent volume once the claim is deleted.
Access Mode - ReadWriteOnce -- the volume can be mounted as read/write by a single node.
Supported FS - ext4/xfs.
Raw Block Volumes: Using the raw block option, a PV can be attached directly to a pod or application without being formatted with an ext4 or xfs file system.
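The options in this list map onto a few fields of a PersistentVolumeClaim and its storage class. The sketch below shows a claim in file system mode and in raw block mode, plus the outcome of each reclaim policy as described above. The claim names, the size, and the helper names are hypothetical; `volumeMode` and `accessModes` are standard Kubernetes fields.

```python
from typing import Optional


def pvc_manifest(name: str, storage_class: str, size: str,
                 raw_block: bool = False) -> dict:
    """Build a PersistentVolumeClaim for dynamic provisioning."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],  # read/write by a single node
            "storageClassName": storage_class,
            # "Block" hands the pod an unformatted device; "Filesystem"
            # formats the volume (ext4 or xfs) and mounts it.
            "volumeMode": "Block" if raw_block else "Filesystem",
            "resources": {"requests": {"storage": size}},
        },
    }


def pv_phase_after_claim_deleted(reclaim_policy: str) -> Optional[str]:
    """Phase of the PV after its claim is deleted; None means the PV is gone."""
    if reclaim_policy == "Retain":
        return "Released"
    if reclaim_policy == "Delete":
        return None
    raise ValueError(f"unsupported reclaim policy: {reclaim_policy}")


fs_claim = pvc_manifest("app-data", "powerflex-xfs", "8Gi")
block_claim = pvc_manifest("app-raw", "powerflex-xfs", "8Gi", raw_block=True)
print(fs_claim["spec"]["volumeMode"], block_claim["spec"]["volumeMode"])
```

A pod consumes a raw block claim through `volumeDevices` rather than `volumeMounts`, which is why the two modes are requested differently at the claim level.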
A Case for Repatriating High-value Workloads with PowerFlex Software-Defined Storage
Wed, 26 Aug 2020 18:33:51 -0000
Kent Stevens, Product Management, PowerFlex
Brian Dean, Senior Principal Engineer, TME, PowerFlex
Michael Richtberg, Chief Strategy Architect, PowerFlex
In this post, we look at why customers are repatriating key applications from the Cloud, help you think about where to run your key applications, and explain how PowerFlex’s unique architecture meets the demands of these workloads as you run and transform your business.
For critical software applications you depend upon to power core business and operational processes, moving to “The Cloud” might seem the easiest way to gain the agility to transform the surrounding business processes. Yet we see many of our customers making the move back home, back “On-Prem” for these performance-sensitive critical workloads – or resisting the urge to move to The Cloud in the first place. PowerFlex is proving to deliver agility and ease of operations for the IT infrastructure for high-value, large-scale workloads and data-center consolidation, along with a predictable cost profile – as a Cloud-like environment enabling you to reach your business objectives safely within your own data center or at co-lo facilities.
IDC recently found that 80% of their customers had repatriation activities, and 50% of public-cloud based applications were targeted to move to hosted-private cloud or on-premises locations within two years(1). IDC notes that the main drivers for repatriation are security, performance, cost, and control. Findings reported by 451 Research(2) show cost and performance as the top disadvantages when comparing on-premises storage to cloud storage services. We’ve further observed that core business-critical applications are a significant part of these migration activities.
You may have heard the term “data gravity,” which refers to the difficulty of moving data to and from the cloud, but that may be only part of the problem. “Application gravity” is likely a bigger problem for performance-sensitive workloads that struggle to achieve the required business results because of the scale and performance limitations of cloud storage services.
Transformation is the savior of your business – but a problem for your key business applications
Business transformation impacts the data-processing infrastructure in important ways: Applications that were stable and seldom touched are now the subject of massive changes on an ongoing basis. Revamped and intelligent business processes require new pieces of data, increasing the storage requirements and those smarts (the newly automated or augmented decision-making) require constant tuning and adjustments. This is not what you want for applications that power your most important business workflows that generate your profitability. You need maximum control and full purview over this environment to avoid unexpected disruptions. It’s a well-known dilemma that you must change the tires while the car is driving down the road – and today’s transformation projects can take this to the extreme.
The infrastructure used to host such high-profile applications – computing, storage and networking – must be operated at scale yet still be ready to grow and evolve. It must be resilient, remain available when hardware fails, and be able to transform without interruption to the business.
Does the public cloud deliver the results you expected?
Do your applications require certain minimum amounts of throughput? Are there latency thresholds you consider critical? Do you require large data capacities and the ability to scale as demands grow? Do you require certain levels of availability? You may assume all these requirements come with a “storage” product offered by the public cloud platforms, but most fall short of meeting these needs. Some require over-provisioning to get better performance. High availability options may be lacking. The highest performing options have capacity scale limitations and can be prohibitively expensive. If you assume a hyperscaler offers the equivalent of what you’ve been using on-prem, you may be surprised to find substantial gaps that require expensive application rearchitecting to be “cloud native,” which can become a budget buster. These public cloud attributes can lead to “application gravity” gaps.
While the agility of it is tempting, the unexpected costliness of moving everything to the public cloud has turned back more than one company. When evaluating the economics and business justification for Cloud solutions, many costs associated with full-scale operations, spikes in demand or extended services can be hard to estimate, and can turn out to be large and unpredictable.
The full price of cloud adoption must account for the required levels of resiliency, management infrastructure, storage and analytics for operational data, security solutions, and scaling up the resources to realistic production levels. Recognizing all the necessary services and scale may undermine what might have initially appeared to be a solid cost justification. Once the budget is established, active effort and attention must be devoted to monitoring and oversight. Adapting to unexpected operational events, such as bursting or autoscaling for temporary spikes in workload or traffic, can bring unforeseen leaps in the monthly bill. Such situations can be especially hard to predict and plan for – and very difficult to control.
You want the speed, convenience and elasticity of running in the cloud - but how do you ensure that agility while staying within the necessary bounds of cost and oversight? Truly transformative infrastructure allows businesses to consolidate compute and storage for disparate workloads onto a single unified infrastructure to simplify their environment, increase agility, improve resiliency and lower operational costs. And your potential payoff is big with far easier scaling, more efficient hardware utilization, and less time spent figuring out how to get things right or tracking down issues that complicate disparate system architectures.
Software-Defined is the Future
IDC predicts that by 2024, software-defined infrastructure solutions will account for 30% of storage solutions(3). At the heart of the PowerFlex family, and the enabler of its flexibility, scale, and performance, is PowerFlex software-defined storage. Ease and reliability of deployment and operation are provided by PowerFlex Manager, an IT operations and lifecycle management tool that gives full visibility and control over the PowerFlex infrastructure solutions.
PowerFlex’s unmatched combination of flexibility, elasticity, and simplicity with predictable high performance - at any scale - makes it ideally suited to be the common infrastructure for any company. Utilizing software defined storage (SDS) and hosting multiple heterogeneous computing environments, PowerFlex enables growth, consolidation, and change with cloud-like elasticity – without barriers that could impede your business.
The resulting unique architecture of the PowerFlex family easily meets the large-scale, always-on requirements of our customers’ core enterprise applications. The power and resiliency of the PowerFlex infrastructure platforms handle everything from high-performance enterprise databases, to web-scale transaction processing, to demanding business solutions in various industries including healthcare, utilities and energy. And this includes the new big-data and analytical workloads that are quickly augmenting the core applications as the business processes are being transformed.
PowerFlex: A Unique Platform for Operating and Transforming Critical Applications
PowerFlex provides the flexibility to utilize your choice of tools and solutions to drive your transformation and consolidation, while controlling the costs of the relentless expansion in data processing. PowerFlex provides the modularity to adapt and grow efficiently while providing the manageability to simplify your operations and reduce costs. It provides the scalable infrastructure on-premises to allow you to focus on your business operations. PowerFlex on-demand options by the end of 2020 enable an elastic OPEX consumption model as well.
As your business needs change, PowerFlex provides a non-disruptive path of adaptability. As you need more compute, storage or application workloads, PowerFlex modularly expands without complex data migration services. As your application infrastructure needs change from virtualization to containers and bare metal, PowerFlex can mix and match these in any combination necessary without needing physical changes or cluster segmentation. PowerFlex provides future-proof capabilities that keep up with your demands with six nines of availability and linear scalability.
With the dynamic new pace of growth and change, PowerFlex can ensure you stay in charge while enabling the agility to adapt efficiently. PowerFlex enables you to leverage the advantages of oversight and cost-effectiveness of the on-premises environment with the ability to meet transformation head-on.
1 IDC Cloud Repatriation Accelerates in a Multi-Cloud World, July 2018
2 451 Research, 2020 Voice of the Enterprise
3 IDC FutureScape: Worldwide Enterprise Infrastructure 2020 Predictions, October 2019
PowerFlex Native Asynchronous Replication RPO with Oracle
Mon, 17 Aug 2020 15:52:45 -0000
PowerFlex software-defined storage platform provides a reliable, high-performance foundation for mission-critical applications like Oracle databases. In many of these deployments, replication and disaster recovery have become a common practice for protecting critical data and ensuring application uptime. In this blog, I will be discussing strategies for replicating mission-critical Oracle databases using Dell EMC PowerFlex software-defined storage.
Customers require disaster recovery and replication capabilities to meet mission-critical business requirements where SLAs demand the highest uptime. They also want to recover quickly from physical or logical disasters, ensuring business continuity and bringing applications back up in minimal time without impact to data. Replication means that the same data is available at multiple locations. For Oracle database environments, it is important to have local and remote replicas of application data that are suitable for testing, development, reporting, disaster recovery, and many other operations. Replication improves the performance and protects the availability of the Oracle database application because the data exists in another location. The advantage of keeping multiple copies of data across geographies is that critical business applications continue to function if the local Oracle database server fails.
Replication enables customers in various scenarios such as:
PowerFlex is a software-defined storage platform designed to significantly reduce operational and infrastructure complexity, empowering organizations to move faster by delivering flexibility, elasticity, and simplicity with predictable performance and resiliency at scale. The PowerFlex family provides a foundation that combines compute as well as high performance storage resources in a managed unified fabric.
PowerFlex is designed to provide extreme performance and massive scalability up to 1000s of nodes. It can be deployed as a disaggregated storage / compute (two-layer), HCI (single-layer), or a mixed architecture. PowerFlex inclusively supports applications ranging from bare-metal workloads and virtualized machines to cloud-native containerized apps. It is widely used for large-scale mission-critical applications like Oracle database. For information about best practices for deploying Oracle RAC on PowerFlex, see Oracle RAC on PowerFlex rack.
PowerFlex also offers several enterprise-class native capabilities to protect critical data at various levels:
PowerFlex software consists of a few important components: the Meta Data Manager (MDM), Storage Data Server (SDS), Storage Data Client (SDC), and Storage Data Replicator (SDR). The MDM manages the PowerFlex system as a whole, which includes metadata, device mapping, volumes, snapshots, system capacity, errors and failures, and system rebuild and rebalance tasks. The SDS is the software component that enables a node to contribute its local storage to the aggregated PowerFlex pool. The SDC is a lightweight device driver that exposes PowerFlex volumes as block devices to applications and hosts. The SDR handles replication activities. PowerFlex also has a unique feature called the Protection Domain. A Protection Domain is a logical entity that contains a group of SDSs. Each SDS belongs to only one Protection Domain.
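The rule that each SDS belongs to only one Protection Domain can be expressed as a simple invariant on a mapping. The sketch below is a toy model only and has no connection to the real MDM implementation or its API; the class, method, and node names are hypothetical.

```python
from typing import Dict, List


class ProtectionDomainMap:
    """Toy model: every SDS is a member of exactly one Protection Domain."""

    def __init__(self) -> None:
        self._domain_of: Dict[str, str] = {}

    def add_sds(self, sds: str, domain: str) -> None:
        current = self._domain_of.get(sds)
        if current is not None and current != domain:
            raise ValueError(f"{sds} already belongs to {current}")
        self._domain_of[sds] = domain

    def members(self, domain: str) -> List[str]:
        return sorted(s for s, d in self._domain_of.items() if d == domain)


topology = ProtectionDomainMap()
topology.add_sds("sds-node-1", "PD1")
topology.add_sds("sds-node-2", "PD1")
topology.add_sds("sds-node-3", "PD2")
print(topology.members("PD1"))
```

Keying the map by SDS rather than by domain makes the one-domain-per-SDS rule impossible to violate silently.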
Figure 1. PowerFlex asynchronous replication between two systems
Replication occurs between two PowerFlex systems designated as peer systems. These peer systems are connected using LAN or WAN and are physically separated for protection purposes. Replication is defined in the scope of a protection domain. All objects that participate in replication are contained in the protection domain, including the volumes in a Replication Consistency Group (RCG). Journal capacity from storage pools in the protection domain is shared among the RCGs in that protection domain.
The SDR handles replication activities and manages the I/O of replicated logical volumes. The SDR is deployed on the same server as the SDS. Only I/Os from replicated volumes flow through the SDR.
Figure 2. PowerFlex replication I/O flow between two systems
For detailed information about Architecture Overview, see Dell EMC PowerFlex: Introduction to Replication White Paper.
It is important to note that this approach to replication allows PowerFlex to support replication at extreme scales. As the number of nodes contributing storage is scaled, so is the number of SDR instances. As a result, this replication mechanism can scale effortlessly from 4 to 1000s of nodes while delivering RPOs as low as 30 seconds and meeting I/O and throughput requirements.
The following illustration demonstrates that the volumes participating in replication are grouped to form the Replication Consistency Group (RCG). RCG acts as the logical container for the volumes.
Figure 3. PowerFlex replication with Oracle database
Depending on the scenario, we can create multiple RCGs for each volume pair or combine multiple volume pairs in a single RCG.
In the above Oracle setup, PowerFlex System-1 is the source and PowerFlex System-2 is the destination. For replication to occur between the source and target, the following criteria must be met:
PowerFlex replication is designed to support RPOs as low as 30 seconds, minimizing data loss if a disaster occurs. When creating an RCG, users can specify an RPO from 30 seconds up to a maximum of 60 minutes.
All the operations performed on source will be replicated to destination within the RPO. To ensure RPO compliance, PowerFlex replicates at least twice for every RPO period. For example, setting RPO to 30 seconds means that PowerFlex can immediately return to operation at the target system with only 30 seconds of potential data loss.
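The two rules stated here, an RPO between 30 seconds and 60 minutes and at least two replication cycles per RPO period, imply an upper bound on the time between cycles of half the RPO. The sketch below only encodes that arithmetic; the function and constant names are hypothetical and do not correspond to any PowerFlex API.

```python
MIN_RPO_SECONDS = 30
MAX_RPO_SECONDS = 60 * 60  # 60 minutes


def max_cycle_interval(rpo_seconds: int) -> float:
    """Longest gap between replication cycles that still fits two per RPO."""
    if not MIN_RPO_SECONDS <= rpo_seconds <= MAX_RPO_SECONDS:
        raise ValueError("RPO must be between 30 seconds and 60 minutes")
    return rpo_seconds / 2


def is_rpo_compliant(seconds_behind_source: float, rpo_seconds: int) -> bool:
    """True if the target lags the source by no more than the RPO."""
    return seconds_behind_source <= rpo_seconds


for rpo in (30, 300, 3600):
    print(rpo, max_cycle_interval(rpo))
```

With the minimum RPO of 30 seconds, for example, a cycle has to complete at least every 15 seconds.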
The following figures depict the replication scenario under a steady-state workload:
Figure 4. 100% RPO compliance for RPO of 30s for an Oracle database during a steady application workload
Figure 5. Replication dashboard view of PowerFlex
In the case of disaster recovery, the entire application can be up and running by failover to secondary, with less than 30 seconds of data loss.
When we do a planned switchover or failover, the volumes on secondary system are automatically changed to read-write access mode and the volumes on source will be changed to read-only. Consequently, we can bring up Oracle database on secondary by setting up the Oracle environment variables and starting the database.
Once the RCG is in failover or switchover mode, the user can decide how to continue with replication:
PowerFlex also provides various other options:
PowerFlex native volume replication is a unique solution that is easy to configure and set up, so customers can protect their data without worrying about disaster.
Irrespective of workload and application, it is designed to support massive scale while providing RPOs as low as 30 seconds.
For more information, please visit: DellTechnologies.com/PowerFlex.
Grace Under Pressure — PowerFlex Rebuild Superpowers
Tue, 04 Aug 2020 20:11:02 -0000
Our first blog in this series, “Resiliency Explained — Understanding the PowerFlex Self-Healing, Self-Balancing Architecture,” gave an overview of how the PowerFlex system architecture provides superior performance and reliability. Today we’ll take you through another level of detail with specific examples of recoverability.
Warning: Information covered in this blog may leave you wanting for similar results from other vendors.
The PowerFlex platform possesses some incredible superpowers that deliver the performance needed to run some of the world’s most demanding applications. But what happens when you experience an unexpected failure like losing a drive, a node, or even a rack of servers? Even planned outages for maintenance can result in vulnerabilities or degraded performance levels, IF you use conventional data protection architectures like RAID.
Just a reminder, PowerFlex is a software-defined storage system that delivers compute and storage in a unified fabric with the elasticity to scale compute, storage, or both to fit the workload. PowerFlex uses all-flash direct-attached media located on standard x86 servers, utilizing industry-standard HBAs and 10 Gb/s or faster NICs to interconnect the servers. The systems scale from 4 nodes to multi-rack configurations of 1000+ nodes, increasing capacity and linearly increasing IOPS, all while sustaining sub-millisecond latency.
PowerFlex takes care of dynamic data placement that ensures there are NO hot spots, so QoS is a fundamental design point and not an afterthought or bolt-on “fix” for a poor data architecture scheme; no data locality is needed. It handles the location of data to ensure there are no single points of failure, and it dynamically redistributes blocks of data if you lose a drive, add a node, take a node offline, or have a server outage (planned or unplanned) affecting a whole tray of drives. It also automatically load-balances the placement of data as storage use changes over time or as nodes are added.
The patented software architecture underlying PowerFlex doesn’t use a conventional RAID protection mechanism. RAID serves a purpose, and even options like erasure code have their place in data protection, but what’s missing in these options? Let’s compare RAID and PowerFlex protection mechanisms:
Think of RAID as a multi-cup layout where you’re looking to ensure each write places data in multiple cups. If you lose a cup, you don’t necessarily re-arrange the data. You’re protected from data loss, but without the re-distribution, you’re still operating in a degraded state and potentially vulnerable to additional failures until the hardware replacement occurs. If you want to tolerate more than one cup failure, you need multiple writes to multiple cups, which creates more overhead (particularly in software-defined storage compared with a hardware RAID controller-based system). It still only takes care of data protection and not necessarily performance recovery.
Think of the PowerFlex architectural layout of data like a three-dimensional checkerboard where we ensure the data placement keeps your data safe. With the checkerboard layout, we can quickly re-arrange the checkers if you lose a box on the board, a row or column, or even a complete board of checkers. The data is re-arranged so that there are always two copies of it, for ongoing protection and restoration of performance. The three-dimensional aspect comes from all nodes and all drives participating in the re-balancing process. The metadata management system seamlessly orchestrates re-distribution and balancing of data placement.
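The checkerboard idea, two copies of every piece of data spread over all nodes and re-protected by the survivors after a failure, can be illustrated with a toy placement model. The source does not describe PowerFlex's real placement algorithm, so the round-robin scheme and all names below are assumptions made purely for illustration.

```python
from collections import Counter
from typing import Dict, Set, Tuple

Layout = Dict[int, Tuple[int, int]]  # chunk id -> (node of copy 1, node of copy 2)


def place_chunks(num_chunks: int, num_nodes: int) -> Layout:
    """Spread two copies of each chunk over different nodes (round-robin)."""
    if num_nodes < 3:
        raise ValueError("need at least three nodes to re-protect after a failure")
    layout: Layout = {}
    for chunk in range(num_chunks):
        first = chunk % num_nodes
        # Vary the offset so each node's partners are spread over all others.
        offset = 1 + (chunk // num_nodes) % (num_nodes - 1)
        layout[chunk] = (first, (first + offset) % num_nodes)
    return layout


def rebuild_after_node_failure(layout: Layout, failed: int,
                               num_nodes: int) -> Tuple[Layout, Set[int]]:
    """Re-create lost copies on surviving nodes; return new layout and source nodes."""
    load = Counter(n for copies in layout.values() for n in copies if n != failed)
    new_layout: Layout = {}
    sources: Set[int] = set()
    for chunk, copies in layout.items():
        if failed not in copies:
            new_layout[chunk] = copies
            continue
        survivor = copies[0] if copies[1] == failed else copies[1]
        candidates = [n for n in range(num_nodes) if n not in (failed, survivor)]
        target = min(candidates, key=lambda n: (load[n], n))  # least-loaded node
        load[target] += 1
        sources.add(survivor)
        new_layout[chunk] = (survivor, target)
    return new_layout, sources


layout = place_chunks(num_chunks=60, num_nodes=6)
rebuilt, sources = rebuild_after_node_failure(layout, failed=0, num_nodes=6)
print(sorted(sources))
```

In this toy model every surviving node ends up serving as a source for the rebuild, which is the property the text credits for fast recovery: the work is shared instead of funneled through a single spare.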
Whether the system has a planned or unplanned outage or a node upgrade or replacement, this automatic rebalancing happens rapidly because every drive in the pool participates. The more nodes and drives there are, the faster any data rebuild completes. In the software-defined PowerFlex solution there’s no worrying about a RAID level or the performance trade-offs; it’s just taken care of for you seamlessly in the background without the annoying complications RAID introduces or the need for any specialized hardware controllers.
PowerFlex looks at actual data rather than treating the whole drive capacity as what needs recovering. In this example, a drive failure occurs. The data levels illustrated here represent the total used capacity in these 6, 9 or 12 node configuration examples (we can scale to over 1,000 nodes). The 25%, 50% and 75% levels show the relative time it takes to rebuild this 960GB SAS SSD and restore the data to a fully healthy (re-protected) state.
We’re showing you a rebuild scenario to emphasize the performance. Taking it to another level, you wouldn’t urgently need to replace the drive: PowerFlex redistributes the data to other drives for protection and sustained performance, using virtual spare space provided by all of the drives to fill the gap. Unlike RAID, we don’t need to replace the drive to return the system to full health. You can replace the drive when it’s convenient.
Notice a few things:
This illustrates what happens when you have 35, 53, and 71 drives participating in the parallel rebuild process for the six, nine and twelve node configurations, respectively.
Node Rebuild (6 drives)
Here we show an example using a similar load level of data on the nodes. One of those nodes contains six drives for a total rebuild of 5.76TB. The entire cluster of drives participates in taking over the workloads, automatically rearranging the data placement, and returning the cluster to always having two copies. Just as in the above drive replacement rebuild, the process leverages all the drives in the cluster to swarm the rebuild and return to a fully protected state. That means for the six node configuration there are 30 drives participating in the parallelized rebuild, 48 in the nine node configuration, and 66 drives in the twelve node configuration.
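The drive counts quoted for the two rebuild scenarios follow directly from the test configuration of six 960GB drives per node. The short calculation below reproduces them; the function names are only illustrative.

```python
DRIVES_PER_NODE = 6
DRIVE_CAPACITY_GB = 960


def drives_in_drive_rebuild(nodes: int) -> int:
    """All drives in the cluster except the one that failed."""
    return nodes * DRIVES_PER_NODE - 1


def drives_in_node_rebuild(nodes: int) -> int:
    """All drives on the surviving nodes."""
    return (nodes - 1) * DRIVES_PER_NODE


def node_raw_capacity_tb() -> float:
    """Raw capacity of one node's drives, in decimal terabytes."""
    return DRIVES_PER_NODE * DRIVE_CAPACITY_GB / 1000


for nodes in (6, 9, 12):
    print(nodes, drives_in_drive_rebuild(nodes), drives_in_node_rebuild(nodes))
print(node_raw_capacity_tb())
```

For 6, 9, and 12 nodes this gives 35, 53, and 71 drives for a single-drive failure and 30, 48, and 66 drives for a node failure, matching the figures in the text, and one node's six drives total 5.76TB.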
Notice again the near-linear improvement in rebuild times as you increase the number of nodes and drives. As in the drive rebuild scenario, the rebuild times also approach a vanishing point where the data saturation level makes little difference.
As mentioned previously, PowerFlex scales to 1000+ nodes. Consider a scenario where an entire rack of servers is affected and the system must remain operational and recoverable (unthinkable in conventional architectures), and you see why our largest customers depend on PowerFlex.
If the above tests were done just to show off the best rebuild times, we would just run these systems without any actual other work occurring, but that wouldn’t reflect a real-world scenario where the intention is to continue operating gracefully and still recover to full operational levels.
These tests were done with the PowerFlex default setting of one I/O per drive. For customers with a more aggressive need to return to a fully protected state, PowerFlex can accelerate rebuilds as a priority. To optimize rebuilds even more than illustrated, you can set the number of I/Os per drive to two or more, or even unlimited. Since this does affect latency and IOPS, which could adversely impact workloads, we chose to illustrate our default example, which intentionally balances keeping workload performance high while doing the rebuild.
Using FIO* as a workload generator, we ran these rebuild scenarios with ~750k IOPS of activity while sustaining 0.5 ms latency levels (the cluster examples here can drive well over 1M IOPS at sub-millisecond levels). This represents a moderately heavy workload operating while we performed these tests. Even with the workload running and the rebuild process taking place, the CPU load was approximately 20%. The workload alone consumed only 8 to 10% of the available CPU capacity. Both of those CPU utilization figures underscore the software-defined infrastructure efficiency of PowerFlex. In this test scenario, compute and storage occupied the same nodes (hyperconverged), but remember that we can also run in a two-layer configuration using compute-only and storage-only nodes for asymmetrical scaling.
The systems used for these tests used the following configuration. Note that we used six drives per node in the R740xd chassis that can hold 24 drives which means there were another 18 slots available for additional drives. As noted previously, more drives mean more parallel capabilities for performance and rebuild velocity.
PowerFlex delivers cloud-scale performance with unrivaled grace-under-pressure reliability, providing a software-defined block storage product with six nines of availability. Be sure to read Part 1 of this blog, “Resiliency Explained — Understanding the PowerFlex Self-Healing, Self-Balancing Architecture”, to see the other protection architecture elements not covered here. For more information on our validated mission-critical workloads like Oracle RAC, SAP HANA, MySQL, MongoDB, SAS, Elastic, VDI, Cassandra and other business-differentiating applications, please visit our PowerFlex product site.
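To put the six nines figure in perspective, an availability percentage can be converted into a yearly downtime budget. This is plain arithmetic, not a PowerFlex measurement; the function name is hypothetical and a 365-day year is assumed.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def max_downtime_seconds(nines: int) -> float:
    """Yearly downtime allowed by an availability of N nines (e.g. 6 -> 99.9999%)."""
    unavailability = 10 ** -nines
    return SECONDS_PER_YEAR * unavailability


six_nines_budget = max_downtime_seconds(6)
print(f"Six nines allows about {six_nines_budget:.1f} seconds of downtime per year")
```

For comparison, three nines under the same assumption allows several hours per year.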
* FIO set for 8k, 20% write, 80% reads
Resiliency Explained — Understanding the PowerFlex Self-Healing, Self-Balancing Architecture
Wed, 15 Jul 2020 16:35:08 -0000|
My phone rang. When I picked up it was Rob*, one of my favourite PowerFlex customers who runs his company’s Storage Infrastructure. Last year, his CTO made the decision to embrace digital transformation across the entire company, which included a software-defined approach. During that process, they selected the Dell EMC PowerFlex family as their Software-Defined Storage (SDS) infrastructure because they had a mixture of virtualised and bare-metal workloads, needed a solution that could handle their unpredictable storage growth, and also one powerful enough to support their key business applications.
During testing of the PowerFlex system, I educated Rob on how we give our customers an almost endless list of significant benefits – blazingly fast block storage performance that scales linearly as new nodes are added to the system; a self-healing & self-balancing storage platform that automatically ensures that it always gives the best possible performance; super-fast rebuilds in the event of disk or node failures, plus the ability to engineer a system that will meet or exceed his business commitments to uptime & SLAs.
PowerFlex provides all this (and more) thanks to its “Secret Sauce” – its Distributed Mesh-Mirror Architecture. It ensures there are always two copies of your application data – thus ensuring availability in case of any hardware failure. Data is intelligently distributed across all the disk devices in each of the nodes within a storage pool. As more nodes are added, the overall performance increases nearly linearly, without affecting application latencies. Yet at the same time, adding more disks or nodes also makes rebuild times during those (admittedly rare) failure situations decrease – which means that PowerFlex heals itself more quickly as the system grows!
PowerFlex automatically ensures that the two copies of each block of data that gets written to the Storage Pool reside on different SDS (storage) nodes, because we need to be able to get a hold of the second copy of data if a disk or a storage node that holds the first block fails at any time. And because the data is written across all the disks in all the nodes within a Storage Pool, this allows for super-quick IO response times, because we access all data in parallel.
Data also gets written to disk using very small chunk sizes – either 1MB or 4KB, depending on the Storage Pool type. Why is this? Doing this ensures that we always spread the data evenly across all the disk devices, automatically preventing performance hot-spots from ever being an issue in the first place. So, when a volume is assigned to a host or a VM, that data is already spread efficiently across all the disks in all Storage Nodes. For example, a 4-Node PowerFlex system, with 3 volumes provisioned from it, will look something like the following:
Figure 1: A Simplified View of a 4-Node PowerFlex System Presenting 3 Storage Volumes
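The layout idea can be sketched in a few lines of code. This is a toy model only: the round-robin scheme and the names below are assumptions made for illustration and are not PowerFlex's actual placement algorithm, which this post does not describe. It shows just the two properties discussed above: the two copies of a chunk never share a node, and chunks end up spread evenly.

```python
from collections import Counter


def place_chunks(num_chunks: int, nodes: list[str]) -> list[tuple[str, str]]:
    """Return one (primary, mirror) node pair per chunk.

    Round-robin placement is used purely for illustration; it is NOT
    the real PowerFlex placement logic.
    """
    if len(nodes) < 2:
        raise ValueError("mirroring needs at least two nodes")
    n = len(nodes)
    placements = []
    for chunk in range(num_chunks):
        primary = nodes[chunk % n]
        # Offset is always in 1..n-1, so the mirror is never the primary node.
        offset = 1 + (chunk // n) % (n - 1)
        mirror = nodes[(chunk + offset) % n]
        placements.append((primary, mirror))
    return placements


nodes = ["node1", "node2", "node3", "node4"]
layout = place_chunks(12, nodes)
copies_per_node = Counter(node for pair in layout for node in pair)
print(copies_per_node)
```

With 12 chunks on 4 nodes, every node ends up holding the same number of chunk copies, which is the even spread that prevents hot-spots.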
Now, here is where the magic begins. In the event of a drive failure, the PowerFlex rebuild process utilizes an efficient many-to-many scheme for very fast rebuilds. It uses ALL the devices in the storage pool for rebuild operations and will always rebalance the data in the pool automatically whenever new disks or nodes are added to the Storage Pool. This means that, as the system grows, performance increases linearly – which is great for future-proofing your infrastructure if you are not sure how your system will grow. But this also gives another benefit – as your system grows in size, rebuilds get faster!
Customers like Rob typically raise their eyebrows at that last statement – until we provide a simple example to get the point across – and then they have a lightbulb moment. Think about what happens if we used a 4-node PowerFlex system, but only had one disk drive in each storage node. All data would be spread evenly across the 4 Nodes, but we also have some spare capacity reserved, which is also spread evenly across each drive. This spare capacity is needed to rebuild data into in the event of a disk or a node failure, and it usually equates to either the capacity of an entire node or 10% of the entire system, whichever is larger. At a superficial level, a 4-Node system would look something like this:
Figure 2: A Simplified View of a 4-Node PowerFlex System & Available Dataflows
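The spare capacity rule of thumb described above (one whole node or 10% of the system, whichever is larger) is easy to express in code. A minimal sketch: the function name is hypothetical, capacities are in arbitrary TB, and "an entire node" is assumed to mean the largest node in the system.

```python
def spare_capacity_tb(node_capacities_tb: list[float]) -> float:
    """Spare = the larger of one whole node or 10% of total system capacity."""
    total = sum(node_capacities_tb)
    largest_node = max(node_capacities_tb)  # assumption: size for the largest node
    return max(largest_node, 0.10 * total)


small_system = spare_capacity_tb([10.0] * 4)   # one node outweighs 10% of 40 TB
large_system = spare_capacity_tb([10.0] * 20)  # 10% of 200 TB outweighs one node
print(small_system, large_system)
```

On small systems the whole-node term dominates, so the spare fraction is high; past ten equal-sized nodes the 10% term takes over.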
If one of those drives (or nodes) failed, then obviously we would end up rebuilding between the three remaining disks, one disk per node:
Figure 3: Our Simplified 4-Node PowerFlex System & Available Dataflows with One Failed Disk
Now of course, in this scenario, that rebuild is going to take some time to complete. We will be performing lots of 1MB or 4KB copies between the three remaining nodes, in both directions, as we rebuild into the spare capacity available on the remaining nodes & get back to having two copies of data in order to be fully protected again. It is worth pointing out here that a node typically contains 10 or 24 drives, not just one, so PowerFlex isn’t just protecting you from “a” drive failure, we’re able to protect you from a whole pile of drives. This is not your typical RAID card schema.
Now – let the magic of PowerFlex begin! What happens if we were to add a fifth storage node into the mix? And what happens when a disk or node fails in this scenario??
Figure 4: Dataflows in a Normally Running 5-Node PowerFlex System … & Available Dataflows with One Failed Disk or Node
It should be clear for all to see that we now have more disks, and nodes, to participate in the rebuild process, making the rebuild complete substantially faster than in our previous 4-node scenario. But PowerFlex nodes do not have just a single disk inside them; they typically have 10 or 24 drive slots. Hence, even for a small deployment with 4 nodes, each having 10 disks, we will have data placed intelligently and evenly across all 40 drives, configured as one Storage Pool. Now, with today’s flash media, that is a heck of a lot of performance capability available at your fingertips, delivered with consistent sub-millisecond latencies.
Let me also highlight the “many-to-many” rebuild scheme used by each Storage Pool. This means that any data within a Storage Pool can be rebuilt to all the other disks in the same Pool. If we have 40 drives in our pool, it means that when one drive fails, the other 39 drives will be utilised to rebuild the data of the failed drive. This results in extremely quick rebuilds that occur in parallel, with minimum impact to application performance if we lose a disk:
Figure 5: A 40-disk Storage Pool, with a Disk Failure… Showing The Magic of Parallel Rebuilds
Note that we had to over-simplify the dataflows between the disks in the figure above, because if we tried to show all the interconnects at play, we would simply have a tangle of green arrows!
Here’s another example to explain the difference between PowerFlex and conventional RAID-type drive protection. The initial rebuild test on an empty system usually takes little more than a minute for the rebuild to complete. This is because PowerFlex will only ever rebuild chunks of application data, unlike a traditional RAID controller, which will rebuild disk blocks whether they contain data or not. Why waste resources rebuilding empty zeroes of data when you need to repair from a failed disk or node as quickly as possible?
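A back-of-the-envelope model shows why both effects matter: only written data is rebuilt, and every surviving drive shares the work. The sketch below is a deliberate simplification with hypothetical names and throughput numbers. It ignores network limits, rebuild throttling, and the diminishing returns seen on large systems, so treat its output as illustrative rather than as PowerFlex performance data.

```python
def rebuild_minutes(data_on_failed_drive_gb: float,
                    drives_in_pool: int,
                    per_drive_rebuild_mb_s: float) -> float:
    """Estimate rebuild time when all surviving drives share the work equally."""
    survivors = drives_in_pool - 1
    if survivors < 1:
        raise ValueError("need at least one surviving drive")
    aggregate_mb_s = survivors * per_drive_rebuild_mb_s
    seconds = (data_on_failed_drive_gb * 1000) / aggregate_mb_s
    return seconds / 60


# Hypothetical inputs: 1000 GB of data on the failed drive, 100 MB/s per drive.
four_drive_pool = rebuild_minutes(1000, 4, 100)
forty_drive_pool = rebuild_minutes(1000, 40, 100)
empty_drive = rebuild_minutes(0, 40, 100)  # nothing written, nothing to rebuild
print(f"{four_drive_pool:.1f} min vs {forty_drive_pool:.1f} min")
```

Under these assumptions the 40-drive pool finishes 13 times faster than the 4-drive pool (39 surviving drives share the work instead of 3), and an empty drive costs no rebuild time at all in this model.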
The PowerFlex Distributed Mesh-Mirror architecture is truly unique and gives our customers the fastest, most scalable and most resilient block storage platform available on the market today! Please visit www.DellTechnologies.com/PowerFlex for more information.
* Name changed to protect the innocent!
PowerFlex and CloudLink: A Powerful Data Security Combination
Wed, 08 Jul 2020 14:06:22 -0000|
Security and operational efficiency continue to top IT executives’ datacenter needs lists. Dell Technologies looks at the complete solution to achieve both so customers can focus on their business outcomes.
Dell Technologies’ PowerFlex is a software-defined storage platform designed to significantly reduce operational and infrastructure complexity, empowering organizations to move faster by delivering flexibility, elasticity, and simplicity with predictable performance and resiliency at scale. PowerFlex provides a unified fabric of compute and storage with scale-out flexibility for either of these ingredients to match workload requirements, with full lifecycle simplification provided by PowerFlex Manager. Dell Technologies’ CloudLink, a data encryption and key management solution, supports workload deployments from edge to core to cloud, providing a perfect complement to the PowerFlex family that enables flexible encryption tailored to the modern datacenter’s needs.
With increasing regulatory and compliance requirements, more and more customers now realize how critical encryption is to securing their data centers and need solutions that are built into their platforms. CloudLink, integrated with PowerFlex, provides reliable data encryption and key management in one solution with the flexibility to satisfy most customers' needs.
CloudLink’s rich feature set integrates directly into the PowerFlex platform allowing our customers access to CloudLink's encryption and key management functionality, including data at rest and data in motion encryption, full key lifecycle management, and lightweight multi-tenancy support.
CloudLink provides software-based data encryption and a full set of key management capabilities for PowerFlex, including:
SEDs offer high-performance, hardware-based Data-at-Rest Encryption, ensuring that all data in the deployment is safe from prying eyes. On a PowerFlex platform, CloudLink can manage the keys for each individual drive and store them safely within our encrypted vault, where customers can leverage CloudLink's full key lifecycle management feature set. This option, also integrated and deployable with PowerFlex Manager, is ideal for sensitive data assets that require high performance.
Sometimes Data-at-Rest Encryption is not enough, and our customers need to encrypt their virtual machines. CloudLink provides VM encryption by deploying agents on the guest OS. CloudLink's agent encryption gives our customers the ability to move encrypted VMs throughout their environment making tasks such as replication, deployment to production from QA, or out to satellite offices, safer and easier.
CloudLink’s encryption for machines agent can also encrypt data volumes on bare metal servers allowing customers to keep their data safe even when deployed on legacy hardware.
When 3rd party encryptors need external key management, they turn to solutions that implement KMIP (Key Management Interoperability Protocol). This open standard defines how encryptors and key managers communicate. CloudLink implements KMIP both as a client and a server to provide basic key storage and management for encryptors such as VMware’s native encryption features, or to plug in to a customer’s existing keystore. These capabilities provide the flexibility required for today’s heterogeneous environments.
There is a sea change occurring in data centers brought on by the relatively new technology of containers. 451 Research, a global research and advisory firm, released the results of its 2020 Voice of the Enterprise survey, which indicates that as companies consider the move to containerized deployments, security and compliance concerns are top of mind. However, many of the new container technology products from which customers can choose do not have proper security built in.
Given the extreme mobility of containers, keeping customers’ data safe as applications move throughout a deployment – especially within the cloud – is a challenge. To address this gap, we introduced file volume encryption for Kubernetes container deployments in our CloudLink 7.0 release, which has been validated with PowerFlex 3.5. Our container encryption functionality is built on the same full lifecycle key management and agent-based encryption architectural model that we currently offer for PowerFlex. We deploy an agent within the container such that it sits directly on the data path. As the data is saved, we intercept it and make sure it is encrypted as it travels to and then comes to rest in the data store.
Hand in hand with PowerFlex, CloudLink provides data encryption and key management with unmatched flexibility, superior reliability, and simple and efficient operations, with support from Dell as a complete solution. PowerFlex Manager is a comprehensive IT operations and lifecycle management tool that drastically simplifies management and ongoing operation. CloudLink is integrated into this tool to make the deployment of the CloudLink agent a natural part of the PowerFlex management framework.
PowerFlex: The advantages of disaggregated infrastructure deployments
Mon, 29 Jun 2020 18:57:26 -0000|
For several years, there has been a big push from quite a number of IT vendors towards delivering solutions based on Hyperconverged Infrastructure or HCI. The general concept of HCI is to take the three primary components of IT, compute, network and storage, and deliver them in a software defined format within a building block, normally an x86 based server. These building blocks are then joined together to create a larger, more resilient environment. The software defined components are typically a hypervisor to provide compute, virtual adapters and switches for networking, along with some software that takes the local disks attached to the server, combines them with the disks directly attached to the other building blocks and presents them as a virtual storage system back to the environment.
The HCI approach is attractive to customers for a variety of reasons:
There are of course scenarios where the HCI model does not fit. The limitations are frequently associated with the software defined storage part of the environment, in situations such as the following:
Several HCI vendors have attempted to address these points but often their solutions to the issues involve a compromise.
What if there was a solution that provided software defined storage that was flexible enough to meet these requirements without compromise?
Step forward PowerFlex, a product flexible enough to be deployed as an HCI architecture, a disaggregated architecture (separate compute and storage layers managed within the same fabric), or a mixture of the two.
So how can PowerFlex be this flexible?
It is all about how the product was initially designed and developed. It consists predominantly of three separate software components:
Each of these components can be installed across a cluster of servers in a variety of ways in order to create flexible deployment scenarios. The SDC and SDS components communicate with one another over a standard TCP/IP network to form an intelligent fabric. This is all overseen by the MDM, which is not in the data path.
Some pictures will help illustrate this far better than I can with words.
By installing the SDC (the C in a yellow box) and the SDS (the S in a green box) on to the same server, an HCI environment is created.
If the SDC and SDS are installed on dedicated servers, a disaggregated infrastructure is created.
And because PowerFlex is entirely flexible (the clue is in the name), HCI and disaggregated architectures can be mixed within the same environment.
What are the advantages of deploying a disaggregated environment?
Whilst HCI deployments are ideal for environments where compute requirements and storage capacity increases remain in lockstep, there are many use cases where compute and storage needs grow independently. PowerFlex is capable of serving both requirements.
PowerFlex was built to allow this disaggregation of resources from day one, which means that there is no downside to performance or capacity when storage nodes are added to existing clusters. In fact, there are only positives: increased performance, capacity and resilience, which sets PowerFlex apart from many other software defined storage products.
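A small sizing sketch makes the independent-growth argument concrete. Everything here is hypothetical: the function names, node specifications and workload numbers are invented for illustration and do not describe any specific PowerFlex node. The point is only that with identical HCI nodes, whichever resource runs out first dictates the node count for both.

```python
import math


def hci_nodes(cores_needed, tb_needed, cores_per_node, tb_per_node):
    """Identical nodes: the scarcer resource sets the node count for both."""
    return max(math.ceil(cores_needed / cores_per_node),
               math.ceil(tb_needed / tb_per_node))


def disaggregated_nodes(cores_needed, tb_needed, cores_per_node, tb_per_node):
    """Separate tiers: return (compute_nodes, storage_nodes), sized independently."""
    return (math.ceil(cores_needed / cores_per_node),
            math.ceil(tb_needed / tb_per_node))


# Hypothetical storage-heavy workload: 128 cores and 400 TB,
# on nodes offering 32 cores or 20 TB each.
hci = hci_nodes(128, 400, 32, 20)
compute, storage = disaggregated_nodes(128, 400, 32, 20)
print(f"HCI: {hci} nodes, carrying {hci * 32} application cores")
print(f"Disaggregated: {compute} compute nodes ({compute * 32} cores), {storage} storage nodes")
```

In this example the HCI design deploys five times the application cores the workload asked for, simply because storage demand drove the node count; the disaggregated design sizes each tier to its own requirement.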
Dell EMC PowerFlex and VMware Cloud Foundation for High Performance Applications
Thu, 25 Jun 2020 13:10:33 -0000|
The world in 2020 has shown all industries that innovation is necessary to thrive in all conditions. VMware Cloud Foundation (VCF) hybrid cloud platform was crafted by innovators who realize the biggest asset our customers have is their information technology and the data that runs the business. The VCF offering takes the complexity out of operationalizing infrastructure to enable greater elasticity, growth, and simplification through improved automation. VCF enables options available using on-premises and multi-cloud deployments to address ever changing enterprise needs.
VMware included design factors that anticipated customers’ use of varying storage options in the flexibility of implementing VCF. VMware vSAN is the standard for VCF hyperconverged infrastructure (HCI) deployments and is directly integrated into vSphere and VCF. For those circumstances where workloads or customer resource usage require alternative storage methods, VMware built flexibility into the VCF storage offering. Just as we see a wide variety in desktop computing devices, one size doesn't fit all applies to the enterprise storage products as well. Dell Technologies’ PowerFlex (formerly VxFlex) provides a software-defined mechanism to add a combination of compute and storage with scale out flexibility. As customers look to software-defined operational constructs for agility, PowerFlex provides an adjustable means to add the right balance of storage resources while enabling non-disruptive additions without painful migrations as demands increase.
Joining the Dell Technologies Cloud family as a validated design, Dell EMC PowerFlex helps customers simplify their path to hybrid cloud by combining the power of Dell EMC infrastructure with VMware Cloud Foundation software as supplemental storage. As a high-performance, scale out, software-defined block storage product, PowerFlex provides a combination of storage and compute in a unified fabric that's well equipped to service particularly challenging workloads. The scalability of compute and/or storage in a modular architecture provides an asymmetrical (2-layer) option to add capacity to either compute or storage independently. PowerFlex makes it possible to transform from a traditional three-tier architecture to a modern data center without any trade-offs between performance, resilience or future expansion.
PowerFlex significantly reduces operational and infrastructure complexity, empowering organizations to move faster by delivering flexibility, elasticity, and simplicity with predictable performance and resiliency at scale for deployments. PowerFlex Manager is a key element of our engineered systems, providing a full lifecycle administration experience for PowerFlex from day 0 through expansions and upgrades. It is independent of, but complementary to, the full stack life cycle management available through VCF via SDDC Manager. A cornerstone value proposition of VCF is administering the lifecycle management of OS upgrades, vSphere updates, vRealize monitoring, automation and NSX administration. PowerFlex Manager works in parallel with VCF to deliver a comprehensive lifecycle experience for the physical ingredients and for the PowerFlex software-defined storage layer. PowerFlex also offers a vRealize Operations plug-in for a unified monitoring capability from VMware vRealize Suite, which is included in most VCF editions. From a storage management perspective, PowerFlex utilizes a management system that complements VCF and VMware vSphere by working within the appropriate vCenter management constructs. PowerFlex Manager provides the administration of PowerFlex storage functions, while VCF and vCenter manage the allocation of LUNs to provisioned VMFS file systems to provide data stores for the provisioned workloads.
PowerFlex systems enable customers to scale from a small environment to enterprise scale with over a thousand nodes. In addition, they provide enterprise-grade data protection, multi-tenant capabilities, and add-on enterprise features such as QoS, thin provisioning, compression and snapshots. PowerFlex systems deliver the performance and time-to-value required to meet the demands of the modern enterprise data center.
Does Supplemental Storage Mean Slow or Light Workload Use Cases?
PowerFlex is available as a Dell Technologies validated design for supplemental storage with VCF, allowing customers to realize the value of PowerFlex within the VCF environment. Because PowerFlex provides sub-millisecond latency, high IOPS and high throughput that scale linearly as nodes join the fabric, the result is a very predictable scaling profile that accelerates the VCF vision within the datacenter.
PowerFlex, as a part of VCF, can help solve for even the most demanding of applications. Using the supplemental capabilities to service workloads with the highest of efficiency provides a best of class performance experience. Some illustrative examples of demanding application workloads validated with PowerFlex, independent of VCF, include the following:
SAP HANA certified for PowerFlex integrated rack in both 4-socket and 2-socket offerings (certification details). Highly efficient in hosting up to six production HANA instances per 4-socket server. Our capabilities outperform external competitors by hosting 2x the capacity. The Configuration and Deployment Best Practices for SAP HANA white paper provides details. While this white paper illustrates a single layer architecture, even better performance characteristics are achievable using the VCF aligned 2-layer architectural implementation of PowerFlex.
Oracle RAC & Microsoft SQL
Flexibility to run compute and storage on separate hardware results in significant reduction of database licensing cost.
Validated/certified by SAS for running SAS mixed analytics workloads (white paper), providing an average throughput of 210 MB/s per core (40% greater than the 150 MB/s per core SAS recommends for certification).
The validated solution (white paper) with Elastic provides customers with the required high-performance, scalable, block-based IO with flexible deployment options in multiple operating environments (Windows, Linux, Virtualized/Bare Metal). Elastic validated the efficiency of PowerFlex using only three compute and four storage nodes to deliver ~1 billion indexing events, measured by Elastic’s Rally benchmarking tool.
The validated PowerFlex solution for Epic delivers six nines of availability and high performance for critical Epic Hyperspace workloads while simultaneously enabling hosting of the VDI with the operational and analytical databases for a completely integrated infrastructure option.
For customers deploying Kubernetes container-based database deployments like Cassandra, PowerFlex provides 300,000 operations/second for 10 million operations (Read intensive operations) with avg read latency of 1ms on just eight nodes.
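The SAS headroom figure in the list above can be checked with one line of arithmetic. The helper name below is hypothetical; the inputs are the two throughput numbers quoted in the SAS item.

```python
def headroom_percent(measured: float, required: float) -> float:
    """Percentage by which a measured value exceeds a required minimum."""
    return (measured - required) / required * 100


# 210 MB/s per core measured, against the 150 MB/s per core recommendation.
sas_headroom = headroom_percent(210, 150)
print(f"SAS throughput headroom: {sas_headroom:.0f}%")
```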
PowerFlex gives Dell Technologies the ability to help customers address diverse infrastructure needs. For more information on all of the Dell Technologies storage options with Cloud Validated Designs for VMware Cloud Foundation, please view our white paper. The implementation guide for using PowerFlex for supplemental storage provides the simple steps to provide complementary storage options for VCF deployments. For more information on the PowerFlex product family and workload solutions, please see the product page here. The PowerFlex White Paper - Technical Overview also provides a comprehensive perspective on how organizations can begin changing the way they think about a modern data center architecture. Please contact your local Dell sales representative for more information.
Other pre-tested Dell Technologies Storage products validated for VMware Cloud Foundation that provide the capabilities to independently scale storage and compute include the offerings below. You can find more details in the Dell Technologies Cloud Validated Designs document.