A Case for Repatriating High-value Workloads with PowerFlex Software-Defined Storage
Tue, 20 Oct 2020 12:16:04 -0000
Kent Stevens, Product Management, PowerFlex
Brian Dean, Senior Principal Engineer, TME, PowerFlex
Michael Richtberg, Chief Strategy Architect, PowerFlex
In this post, we look at why customers are repatriating key applications from the Cloud, help you think about where to run your key applications, and explain how PowerFlex’s unique architecture meets the demands of these workloads in running and transforming your business.
For critical software applications you depend upon to power core business and operational processes, moving to “The Cloud” might seem the easiest way to gain the agility to transform the surrounding business processes. Yet we see many of our customers making the move back home, back “On-Prem” for these performance-sensitive critical workloads – or resisting the urge to move to The Cloud in the first place. PowerFlex is proving to deliver agility and ease of operations for the IT infrastructure for high-value, large-scale workloads and data-center consolidation, along with a predictable cost profile – as a Cloud-like environment enabling you to reach your business objectives safely within your own data center or at co-lo facilities.
IDC recently found that 80% of their customers had repatriation activities, and 50% of public-cloud based applications were targeted to move to hosted-private cloud or on-premises locations within two years(1). IDC notes that the main drivers for repatriation are security, performance, cost, and control. Findings reported by 451 Research(2) show cost and performance as the top disadvantages when comparing on-premises storage to cloud storage services. We’ve further observed that core business-critical applications are a significant part of these migration activities.
You may have heard the term “data gravity,” which relates to the difficulty of moving data to and from the cloud, but that may be only part of the problem. “Application gravity” is likely a bigger problem for performance-sensitive workloads that struggle to achieve the required business results because of the scale and performance limitations of cloud storage services.
Transformation is the savior of your business – but a problem for your key business applications
Business transformation impacts the data-processing infrastructure in important ways: Applications that were stable and seldom touched are now the subject of massive changes on an ongoing basis. Revamped and intelligent business processes require new pieces of data, increasing the storage requirements and those smarts (the newly automated or augmented decision-making) require constant tuning and adjustments. This is not what you want for applications that power your most important business workflows that generate your profitability. You need maximum control and full purview over this environment to avoid unexpected disruptions. It’s a well-known dilemma that you must change the tires while the car is driving down the road – and today’s transformation projects can take this to the extreme.
The infrastructure used to host such high-profile applications – computing, storage and networking – must be operated at scale yet still be ready to grow and evolve. It must be resilient, remain available when hardware fails, and be able to transform without interruption to the business.
Does the public cloud deliver the results you expected?
Do your applications require certain minimum amounts of throughput? Are there latency thresholds you consider critical? Do you require large data capacities and the ability to scale as demands grow? Do you require certain levels of availability? You may assume all these requirements come with a “storage” product offered by the public cloud platforms, but most fall short of meeting these needs. Some require over-provisioning to get better performance. High availability options may be lacking. The highest performing options have capacity scale limitations and can be prohibitively expensive. If you assume a hyperscaler offers the equivalent of what you’ve been using on-prem, you may be surprised to find substantial gaps that require expensive application rearchitecting to be “cloud native,” which may become a budget buster. These public cloud attributes can lead to “application gravity” gaps.
While the agility of it is tempting, the unexpected costliness of moving everything to the public cloud has turned back more than one company. When evaluating the economics and business justification for Cloud solutions, many costs associated with full-scale operations, spikes in demand or extended services can be hard to estimate, and can turn out to be large and unpredictable.
The full price of cloud adoption must account for the required levels of resiliency, management infrastructure, storage and analytics for operational data, security solutions, and scaling up the resources to realistic production levels. Recognizing all the necessary services and scale may undermine what might have initially appeared to be a solid cost justification. Once the budget is established, active effort and attention must be devoted to monitoring and oversight. Adapting to unexpected operational events, such as bursting or autoscaling for temporary spikes in workload or traffic, can bring unforeseen leaps in the monthly bill. Such situations can be especially hard to predict and plan for – and very difficult to control.
You want the speed, convenience and elasticity of running in the cloud - but how do you ensure that agility while staying within the necessary bounds of cost and oversight? Truly transformative infrastructure allows businesses to consolidate compute and storage for disparate workloads onto a single unified infrastructure to simplify their environment, increase agility, improve resiliency and lower operational costs. And your potential payoff is big with far easier scaling, more efficient hardware utilization, and less time spent figuring out how to get things right or tracking down issues that complicate disparate system architectures.
Software-Defined is the Future
IDC predicts that by 2024, software-defined infrastructure solutions will account for 30% of storage solutions(3). At the heart of the PowerFlex family, and the enabler of its flexibility, scale and performance, is PowerFlex software-defined storage. The ease and reliability of deployment and operation are provided by PowerFlex Manager, an IT operations and lifecycle management tool for full visibility and control over the PowerFlex infrastructure solutions.
PowerFlex’s unmatched combination of flexibility, elasticity, and simplicity with predictable high performance - at any scale - makes it ideally suited to be the common infrastructure for any company. Utilizing software defined storage (SDS) and hosting multiple heterogeneous computing environments, PowerFlex enables growth, consolidation, and change with cloud-like elasticity – without barriers that could impede your business.
The resulting unique architecture of the PowerFlex family easily meets the large-scale, always-on requirements of our customers’ core enterprise applications. The power and resiliency of the PowerFlex infrastructure platforms handle everything from high-performance enterprise databases, to web-scale transaction processing, to demanding business solutions in various industries including healthcare, utilities and energy. And this includes the new big-data and analytical workloads that are quickly augmenting the core applications as the business processes are being transformed.
PowerFlex: A Unique Platform for Operating and Transforming Critical Applications
PowerFlex provides the flexibility to utilize your choice of tools and solutions to drive your transformation and consolidation, while controlling the costs of the relentless expansion in data processing. PowerFlex provides the modularity to adapt and grow efficiently while providing the manageability to simplify your operations and reduce costs. It provides the scalable infrastructure on-premises to allow you to focus on your business operations. By the end of 2020, PowerFlex on-demand options will enable an elastic OPEX consumption model as well.
As your business needs change, PowerFlex provides a non-disruptive path of adaptability. As you need more compute, storage or application workloads, PowerFlex modularly expands without complex data migration services. As your application infrastructure needs change from virtualization to containers and bare metal, PowerFlex can mix and match these in any combination necessary without needing physical changes or cluster segmentation. PowerFlex provides future-proof capabilities that keep up with your demands with six nines of availability and linear scalability.
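To put “six nines” in concrete terms, the short calculation below converts an availability level into the downtime it allows per year. This is a back-of-the-envelope sketch, not a description of how the availability figure is measured; the 365-day year and the function name are assumptions made for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def allowed_downtime_seconds(nines: int) -> float:
    """Return the yearly downtime budget implied by an availability of N nines."""
    unavailability = 10 ** -nines  # e.g. six nines -> 0.000001
    return SECONDS_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(f"{n} nines: {allowed_downtime_seconds(n):.1f} seconds per year")
```

Under these assumptions, six nines works out to roughly 31.5 seconds of downtime per year.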
With the dynamic new pace of growth and change, PowerFlex can ensure you stay in charge while enabling the agility to adapt efficiently. PowerFlex enables you to leverage the advantages of oversight and cost-effectiveness of the on-premises environment with the ability to meet transformation head-on.
For more information, see PowerFlex on DellEMC.com, or contact us.
footnotes:
1 IDC Cloud Repatriation Accelerates in a Multi-Cloud World, July 2018
2 451 Research, 2020 Voice of the Enterprise
3 IDC FutureScape: Worldwide Enterprise Infrastructure 2020 Predictions, October 2019
Related Blog Posts
PowerFlex Summer 2021 Updates Deliver on Execution, Compliance, and Confidence
Tue, 04 Jul 2023 09:48:51 -0000
Execute Flawlessly – Comply Effortlessly – Be Confident
The summer 2021 release of Dell EMC PowerFlex Software-defined Infrastructure extends the PowerFlex family’s transformational superpowers, providing businesses with the agility to thrive in ever-changing economic and technological landscapes. The release of PowerFlex 3.6 and PowerFlex Manager 3.7 enables customers to supercharge their mission-critical workloads with enhanced automation and platform options. It safeguards workload execution with expanded continuity and compliance offerings. And businesses running PowerFlex can be confident in predictable outcomes at scale with new infrastructure insights, network resiliency enhancements, and integrated upgrade guidance.
Keep an eye on the important stuff
A highlight of this release is PowerFlex integration with Dell EMC CloudIQ, a cloud-based application that intelligently and proactively monitors the health of Dell EMC storage, data protection, HCI and other systems. Users can enjoy a single UI for multi-system, multi-site PowerFlex monitoring that includes system health, configuration/inventory, capacity usage, and performance. The PowerFlex system must be first connected to Dell EMC Secure Remote Services (SRS), and then CloudIQ is automatically enabled. Health scores are based on health check algorithms that use capacity, performance, configuration, components, and data protection criteria whose value is informed by PowerFlex alert data. Users can opt in to get health notifications via email or mobile phones, and the history of generated and cleared health issues is maintained for two years. After ingesting a couple of weeks of data, CloudIQ machine learning will begin looking for and noting IOPS and bandwidth anomalies. It also watches for and signals latency performance impacts.
For information on adding your PowerFlex system(s) to CloudIQ see the Knowledge Base article. And to get a hands-on look at PowerFlex in CloudIQ, check out the online Simulator (log in with your support account) and see technical white papers and demo recordings on www.delltechnologies.com/cloudiq.
Be safe with your data out there
PowerFlex native asynchronous replication was introduced last year with version 3.5. Now, in PowerFlex 3.6, we have made it even more flexible and improved compliance targets. We cut the minimum RPO in half and now support RPOs as low as 15 seconds. We also added tooling to improve control over Replication Consistency Groups (RCGs) – sets of PowerFlex volumes being replicated together. RCGs can now be active or inactive, where inactive RCGs hold their configuration but use no additional system resources. The ability to terminate an RCG and leave it in an inactive state also improves the recovery process if a user runs out of journal capacity.
With this release, PowerFlex supports replication in VMware HCI environments. In this scenario, PowerFlex Manager 3.7 (and above) orchestrates resizing the Storage Virtual Machines (SVMs) and the addition of the Storage Data Replicators (SDRs) into the system. Because the orchestration is done by PowerFlex Manager, the option to replicate between PowerFlex HCI deployments running VMware is limited to appliance and rack deployments.
Systems running 3.5.x can be active replication peers with systems running 3.6, and the source and destination systems can be on different code versions long term. For further information about PowerFlex replication architecture, limitations and design considerations, see the Dell EMC PowerFlex: Introduction to Replication white paper.
Along with these internal replication improvements, we are introducing integration with VMware Site Recovery Manager (SRM) – disaster recovery management and automation software for virtual machines and their workloads. The PowerFlex Storage Replication Adapter (SRA) enables PowerFlex as the native replication engine for protecting VMs on vSphere datastores. The PowerFlex SRA is compatible with SRM 8.2 or 8.3, the Photon OS-based SRM appliances. And while we are introducing this with the current releases, the SRA is compatible with PowerFlex systems running 3.5.1.x and above. Users can create recovery plans to fail over VMs to another site, fail back to the original, or use PowerFlex’s non-disruptive replication failover testing to run failover tests in SRM.
The SRA and installation instructions are available for download from the VMware website. For detailed information about the SRA implementation and usage examples, see the whitepaper on Disaster Recovery for Virtualized Workloads Dell EMC PowerFlex using VMware Site Recovery Manager.
The following figure shows an architecture overview of PowerFlex SRA and VMware SRM:
PowerFlex native replication, and the integration with VMware Site Recovery Manager, provide robust, crash-consistent data protection for disaster recovery and business continuity. But we are also introducing integration with Dell EMC AppSync for application-consistent copy lifecycle management. For customers using the wide range of supported databases and filesystems, AppSync v4.3 adds support for PowerFlex, seamlessly bringing PowerFlex’s superpowers into AppSync’s simplified copy data management. AppSync has deep integrations with Oracle, Microsoft SQL Server, Microsoft Exchange, and SAP HANA, and it enables VM-consistent copies of data stores and individual VM recovery for VMware environments. But it can also support other enterprise applications – like EPIC Cache, DB2, MySQL, etc. – through file system copies.
AppSync with PowerFlex integration will be available mid-July 2021. For information and examples, see the Dell EMC PowerFlex and AppSync integration video.
One more note on security. PowerFlex rack and appliance are now FIPS 140-2 compliant for data at rest and key management. Hardware-based data at rest encryption is achieved using supported self-encrypting drives (SEDs), with the encryption engine running on the SEDs to deliver better performance and security. The SED-based encryption claim is based on FIPS 140-2 Level 2 certification. Dell EMC CloudLink, the KMIP and FIPS 140-2 Level 1 (CloudLink Agent and CloudLink Server) compliant key manager, is used to manage SED encryption keys.
Automate (and containerize) all the things
PowerFlex software-defined infrastructure is eminently suited to cloud-native use cases and automatable workflows. There has been a lot of recent progress in PowerFlex’s support for these ecosystems. The Container Storage Interface (CSI) driver for PowerFlex continues to evolve, with support for accessing multiple PowerFlex clusters, ephemeral inline volumes, and, importantly, containerized PowerFlex Storage Data Client (SDC) deployment and management. The containerized SDC allows CSI to inject the PowerFlex volume driver into the kernel of container-optimized operating systems that lack package managers. This provides PowerFlex CSI support for Red Hat CoreOS and Fedora CoreOS. And it also enables integration of PowerFlex with Red Hat OpenShift 4.6 and greater. The forthcoming CSI version 1.5 adds support for volume consistency groups and custom file system format options. Users can set specific disk format command parameters when provisioning a volume. Star and watch the GitHub Repository for the PowerFlex CSI Driver for updates.
In addition to this, Dell Technologies has been developing a set of Container Storage Modules (CSM) that complement the CSI drivers. PowerFlex is at the forefront of that effort, and there are several modules available for tech preview, with general availability coming later this year.
- Observability CSM: Provides exportable telemetry metrics for I/O performance & storage usage, for consumption in tools like Grafana and Prometheus. Bridges the observability gap between Kubernetes and PowerFlex storage admins.
- Authorization CSM: Provides a set of RBAC tools for PowerFlex and Kubernetes. This is an out-of-band tool proxying admin credentials and enabling the management of storage consumers and their limits (e.g., tenant segmentation, storage quota limits, isolation, auditing, etc.).
- Resiliency CSM: Provides stateful application fault protection & detection, resiliency for node failure and network disruptions. Reschedules failed pods on new resources and asks the CSI driver to un-map and re-map the persistent storage volumes to the online nodes.
Users can automate volume and snapshot lifecycle management with the PowerFlex Ansible Modules. They can also use the modules to gather facts about their PowerFlex systems and manage various storage pool and SDC details. The Ansible modules are available on GitHub and Ansible Galaxy. They work with Ansible 2.9 or later and require the PowerFlex Python SDK (which may also be used by itself to facilitate authentication to and interaction with a PowerFlex cluster). Again, watch the repositories for additional modules and expansions in the near future.
All these automation tools leverage and rely upon the PowerFlex REST API. And Dell Technologies has introduced a new Developer Portal, where the APIs for many products can be explored. The PowerFlex API, along with explanations and usage examples, can be found at https://developer.dell.com/apis/4008/versions/3.6/docs.
Always keep on improving
With every release, PowerFlex and PowerFlex Manager get faster, more secure, and more easily manageable. In PowerFlex 3.6 there are a number of UI enhancements, including simplification of menus, better capacity reporting around data reduction, a new dedicated area for snapshots and snapshot policy management, and – following on Dell Technologies’ drive towards more inclusive language – a change in the labels for the MDM cluster roles. “Master” and “Slave” roles are now “Primary” and “Secondary”.
PowerFlex 3.6 introduces support for Oracle Linux Virtualization (KVM based), which adds a supported hypervisor layer to the previous support for Oracle Enterprise Linux. This advances the numerous Oracle database deployments on PowerFlex, providing improved Oracle supportability while still offering the great cost-effectiveness PowerFlex offers for running Oracle. For detailed information on installing and configuring, please refer to the Oracle Linux KVM on PowerFlex white paper.
In the software-defined storage layer itself, version 3.6 doubles the number of Storage Data Clients (the consumers of PowerFlex volumes) per system to 2048. This doubles the number of hosts that can map volumes from PowerFlex storage pools. The software is also smarter when it comes to detecting and handling network error cases. In some disaggregated, or two-layer, systems where the SDCs live on a different network from the storage cluster itself, a network path impairment between an SDC and a single Storage Data Server (SDS) node can cause I/O failures – even when there isn’t a general network failure in the cluster. In version 3.6, if such a disruption occurs, the SDC can use another SDS in the system to proxy the I/O to its original destination. Users are alerted until the problem is cleared, but I/O continues uninterrupted.
Because of the highly distributed architecture of PowerFlex, ports or sockets experiencing frequent disconnects (flapping) can cause overall system performance issues. Version 3.6 detects this and proactively disqualifies the path, preventing general disruption across the system.
In version 3.5, we introduced Protected Maintenance Mode (PMM), a super-safe way to put a node into maintenance while nevertheless avoiding a lengthy data-rehydration process at the end. Now, PMM makes use of the highly parallel many-to-many rebalancing algorithm, as a node goes into maintenance. Depending on the amount of data stored on the node, this can still be a long process, and other things can change in the system as it’s happening. PowerFlex 3.6 adds an auto-abort feature, in which the system continually scans for hardware or capacity issues that would prevent the node from completely entering PMM. If any flags are raised, the system will abort the process and notify the user. More information on maintenance modes, and the new PMM auto-abort feature, can be found in this whitepaper.
PowerFlex Manager 3.7 has gotten much more powerful as well. Foremost among the improvements is a new Compatibility Management feature. This new feature helps customers automatically identify the recommended upgrade paths for both the PowerFlex Manager appliance itself and the system RCM/IC upgrade. Prior to this release, whenever a customer or Dell Professional Services wished to do an upgrade, it took a lot of effort and time to manually investigate the documentation and compatibility matrixes to understand all of the upgrade rules – what are the allowed upgrade paths, which PowerFlex Manager version works with which RCM/IC versions, etc.
The new Compatibility Management tools eliminate the work and assist users by automatically identifying recommended upgrade paths. To determine which paths are valid and which are not, PowerFlex Manager uses information that is provided in a compatibility matrix file. The compatibility matrix file maps all the known valid and invalid paths for all previous releases of the software. It breaks the possible upgrade paths down as:
- Recommended: tested or implied as tested
- Supported: allowed, but not necessarily tested
- Not Allowed: unsupported update path
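As a rough illustration of the idea, the sketch below models a compatibility matrix as a lookup table and classifies a requested upgrade path into one of the three categories. The data layout, the placeholder version strings, and the function names are all hypothetical; the actual file format PowerFlex Manager uses is not described here. Treating an unlisted path as not allowed is also a design assumption of this sketch.

```python
from enum import Enum


class PathStatus(Enum):
    RECOMMENDED = "Recommended"   # tested or implied as tested
    SUPPORTED = "Supported"       # allowed, but not necessarily tested
    NOT_ALLOWED = "Not Allowed"   # unsupported update path


# Hypothetical matrix: (from_version, to_version) -> status.
# The version strings are placeholders, not real release data.
EXAMPLE_MATRIX = {
    ("A.1", "A.3"): PathStatus.RECOMMENDED,
    ("A.1", "A.2"): PathStatus.SUPPORTED,
    ("A.2", "A.3"): PathStatus.RECOMMENDED,
    ("A.3", "A.1"): PathStatus.NOT_ALLOWED,
}


def classify_path(matrix, current, target):
    """Look up an upgrade path; a path missing from the matrix is treated as not allowed."""
    return matrix.get((current, target), PathStatus.NOT_ALLOWED)


def recommended_targets(matrix, current):
    """List the versions reachable from `current` via a recommended path."""
    return sorted(
        target
        for (source, target), status in matrix.items()
        if source == current and status is PathStatus.RECOMMENDED
    )


if __name__ == "__main__":
    print(classify_path(EXAMPLE_MATRIX, "A.1", "A.2").value)
    print(recommended_targets(EXAMPLE_MATRIX, "A.1"))
```

Keeping the rules in a data file rather than in code is what lets the tool answer “where can I go from here?” without anyone re-reading the documentation for each upgrade.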
PowerFlex Manager 3.7 also introduces support for vSphere 7.0 U2. Upgrading to this version requires a manual vCenter upgrade. But then PowerFlex Manager will take over and manage the ESXi clusters. PowerFlex Manager 3.7 supports VMware ESXi 7.0 Update 2 installation, upgrade, and expansion operations for both hyperconverged and compute-only services. Users can deploy new services, add existing services running VMware ESXi 7.0 U2, or expand existing services. PowerFlex Manager also supports upgrades of VMware ESXi clusters in hyperconverged or compute-only services. You can upgrade VMware ESXi clusters from version 6.5, 6.7, or 7.0 to VMware ESXi 7.0 Update 2.
When you deploy a new ESXi 7.0U2 service, PowerFlex Manager automatically deploys two service volumes and maps these volumes to two heartbeat datastores on shared storage. PowerFlex Manager also deploys three vSphere Cluster Services (vCLS) VMs for the cluster.
PowerFlex Manager introduces several other enhancements in this release. It now supports 32k volumes per Service, aligned with PowerFlex core software volume scalability. It has enhanced security for SMB/NFS. A user-specific account is now required to gain access to the SMB share. PowerFlex Manager also updates the NFS share configuration when a user upgrades or restores the virtual appliance. PowerFlex Manager has disabled support for the SMBv1 protocol. PowerFlex Manager now uses SMBv2 or SMBv3 to enhance security.
It has also expanded its management capabilities over the PowerFlex Presentation Server and Gateway services. Prior to this release, PowerFlex Manager could deploy a Presentation Server (which hosts the WebUI) but not upgrade it. Now, PowerFlex Manager 3.7 can both discover existing instances and upgrade Presentation Servers. Similarly, it has gained the ability to upgrade the OS for the Gateway (which hosts the REST API). Prior to this release, PowerFlex Manager could only upgrade the Gateway RPM package without upgrading and patching the OS of the Gateway. Now PowerFlex Manager 3.7 can do both.
But it’s not all about software
This release adds support for a broader array of NVIDIA GPUs. Next-gen NVIDIA acceleration cards are now available for customers looking to run specialized, high-performance computing and analytics applications - Quadro RTX 6000, Quadro RTX 8000, A40, and A100. And we also introduce a small form factor GPU that can be used in the 1U R640-based PowerFlex Nodes – the NVIDIA A100. The past year demonstrated the importance of supporting remote workers with virtual desktops, and PowerFlex supports GPU implementations on Citrix and VMware VDI.
We now support the Dell PowerSwitch S5296F-ON for the PowerFlex appliance. The S5296F-ON has 96x 10/25G SFP28 ports and 8x 100G QSFP28 ports. It can support high node counts in a single cabinet, if the high oversubscription ratio is acceptable. We also introduce support for the Cisco Nexus 93180YC-FX, for use as either an access or an aggregation switch, and the Cisco 9364C-GX, for use as either a leaf or a spine switch, with 64x 100G ports.
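For a sense of what that oversubscription ratio looks like, here is a quick calculation based on the port counts above. It assumes every access port runs at 25G and all eight 100G ports are used as uplinks, which may not match a given deployment; the function name is illustrative.

```python
from fractions import Fraction


def oversubscription_ratio(down_ports, down_gbps, up_ports, up_gbps):
    """Ratio of total downlink bandwidth to total uplink bandwidth."""
    return Fraction(down_ports * down_gbps, up_ports * up_gbps)


if __name__ == "__main__":
    # Assumed worst case: 96 ports at 25G facing nodes, 8 ports at 100G facing upstream.
    ratio = oversubscription_ratio(96, 25, 8, 100)
    print(f"{ratio.numerator}:{ratio.denominator}")  # prints 3:1
```

With the same ports running at 10G instead, the ratio under these assumptions drops to 6:5.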
Virtualized network infrastructure continues to grow in capability and deployment share. NSX-T™ is VMware's software-defined-network infrastructure that addresses cross-cloud needs for VM-centric network services and security. The PowerFlex appliance now joins the PowerFlex rack, in supporting NSX-T Ready configurations. “NSX-T Ready” means that the hardware configuration meets NSX-T requirements. The customer will provide NSX-T software and deploy with assistance from VMware or Dell Professional Services. The enabling components are:
- A 4-node PowerFlex management cluster, available to host the NSX-T controller VMs
- Appliance-specific NSX-T edge nodes (need 2 to 8 for running the NSX-T edge services)
- High-level NSX-T topologies and considerations available in the PowerFlex Appliance Network Planning Guide
- PowerFlex Manager will run the NSX-T edge nodes in Lifecycle Mode
As with the PowerFlex rack, appliance NSX-T edge nodes are “service appliances” that are dedicated to run network services, while the newly available HA appliance management nodes run the NSX-T management VMs. PowerFlex Manager can assist in deploying the edge nodes and will lifecycle the hardware aspects.
Wrap it up
Thanks for taking time to read about what’s new with Dell EMC PowerFlex software-defined infrastructure. We haven’t even been able to cover all the great new things being introduced this summer. Supercharge your mission-critical workloads flawlessly with enhanced automation, effortlessly enable business continuity and compliance, and confidently manage your data center operations at scale. To continue exploring, visit us on the Dell Technologies website for PowerFlex.
Grace Under Pressure — PowerFlex Rebuild Superpowers
Wed, 27 Jan 2021 12:30:14 -0000
The first blog in this series, “Resiliency Explained — Understanding PowerFlex's Self-Healing, Self-Balancing Architecture,” covered an overview of how the PowerFlex system architecture provides superior performance and reliability. Today, we’ll take you through another level of detail with specific examples of recoverability.
Warning: Information covered in this blog may leave you wanting for similar results from other vendors.
PowerFlex possesses some incredible superpowers, delivering the performance to run some of the world’s most demanding applications. But what happens when you experience an unexpected failure like losing a drive, a node, or even a rack of servers? Even planned outages for maintenance can result in vulnerabilities or degraded performance levels, if you use conventional data protection architectures like RAID.
Just a reminder, PowerFlex is a high-performance software-defined storage system that delivers the compute and storage system in a unified fabric with the elasticity to scale either compute, storage or both to fit the workload. PowerFlex uses all-flash direct-attached media located on standard x86 servers utilizing industry-standard HBAs and 10 Gb/s or higher Ethernet NICs that interconnect servers. The systems scale from 4 nodes to multi-rack 1000+ nodes, increasing capacity and linearly increasing IOPS, all while sustaining sub-millisecond latency.
Powerful Protection
PowerFlex takes care of dynamic data placement that ensures there are NO hot spots, so QoS is a fundamental design point and not an after-thought bolt-on “fix” for a poor data architecture scheme; there’s no data locality needed. PowerFlex handles the location of data to ensure there are no single points of failure, and it dynamically re-distributes blocks of data if you lose a drive, add a node, take a node off line, or have a server outage (planned or unplanned) containing a large number of drives. It automatically load balances the placement of data as storage use changes over time or with node expansion.
The patented software architecture underlying PowerFlex doesn’t use a conventional RAID protection mechanism. RAID serves a purpose, and even options like erasure coding have their place in data protection. What’s missing in these options? Let’s use a couple of analogies to compare traditional RAID and PowerFlex protection mechanisms:
RAID
Think of RAID as a multi-cup layout where you’re looking to ensure each write places data in multiple cups. If you lose a cup, you don’t necessarily re-arrange the data. You’re protected from data loss, but without the re-distribution, you’re still operating in a degraded state and potentially vulnerable to additional failures until the hardware replacement occurs. If you want to tolerate more than one cup failure, you need multiple writes to multiple cups, which creates more overhead (particularly in a software-defined storage system versus a hardware RAID controller-based system). It still only takes care of data protection and not necessarily performance recovery.
PowerFlex
Think of the architectural layout of data like a three-dimensional checkerboard where we ensure the data placement keeps your data safe. In the checkerboard layout, we can quickly re-arrange the checkers if you lose a box on the board or a row/column or even a complete board of checkers. We re-arrange the data to ensure there are always two copies of the data, for ongoing protection and restoration of performance. The three-dimensional aspect comes from all nodes and all drives participating in the re-balancing process. The metadata management system seamlessly orchestrates re-distribution and balancing of data placement.
Whether the system has a planned or unplanned outage or a node upgrade or replacement, this automatic rebalancing happens rapidly because every drive in the pool participates. The more nodes and the more drives, the faster any data rebuild completes. In the software-defined PowerFlex solution there’s no worrying about a RAID level or the performance trade-offs. It’s just taken care of for you seamlessly in the background, without any of the annoying complications RAID often introduces or the need for any specialized hardware controllers and their associated cost.
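To make the checkerboard idea a little more tangible, here is a toy model in Python. It is not PowerFlex’s actual placement or rebuild algorithm. It simply keeps two copies of every chunk on two different nodes, then re-creates the copies lost with a failed node on whichever surviving nodes do not already hold one. The chunk counts, node names, random placement, and function names are all assumptions made for illustration.

```python
import random
from collections import Counter


def place_chunks(num_chunks, nodes, rng):
    """Give every chunk two copies on two different nodes, chosen at random."""
    return {chunk: set(rng.sample(nodes, 2)) for chunk in range(num_chunks)}


def rebuild_after_node_failure(placement, failed, nodes, rng):
    """Re-create the copies lost with `failed` on surviving nodes.

    Returns a Counter of how many new copies each surviving node received.
    """
    survivors = [n for n in nodes if n != failed]
    received = Counter()
    for holders in placement.values():
        if failed in holders:
            holders.discard(failed)
            # The new copy must land on a node that does not already hold one.
            candidates = [n for n in survivors if n not in holders]
            target = rng.choice(candidates)
            holders.add(target)
            received[target] += 1
    return received


if __name__ == "__main__":
    rng = random.Random(0)
    nodes = [f"node{i}" for i in range(6)]
    placement = place_chunks(10_000, nodes, rng)
    received = rebuild_after_node_failure(placement, "node0", nodes, rng)
    # Every surviving node takes a share of the rebuild work.
    print(dict(received))
```

Even in this simplified form, the rebuild writes are spread across all surviving nodes instead of funneling into a single replacement device, which is the intuition behind rebuilds getting faster as the pool grows.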
Results
Drive Rebuild
PowerFlex looks at the actual data stored on each drive rather than treating the whole drive capacity as what needs recovering. In this example, a drive failure occurs. The data levels illustrated here represent the total used capacity in these 6-, 9- and 12-node configuration examples (we can scale to over 1,000 nodes). The 25%, 50% and 75% levels show the relative time needed to rebuild the data from this 960GB SAS SSD and return to a fully healthy (re-protected) state.
We’re showing you a rebuild scenario to emphasize the performance, but there’s more: you don’t need to replace the drive urgently. PowerFlex redistributes the data to the other drives, using virtual spare space provided by all of the drives, to restore protection and sustain performance. Unlike RAID, we don’t need to replace the drive to return the system to full health. You can replace the drive when it’s convenient.
Notice a few things:
- More nodes = less rebuild time! Try scaling out alternative options and we think you’ll find the inverse.
- Rebuild performance improves nearly linearly as you add more drives and nodes. Imagine if this showed even more nodes participating in the rebuild process!
- More data density doesn’t result in a linear increase in the rebuild time. As you see in the 12-node configuration, it starts to converge on a vanishing point.
This illustrates what happens when you have 35, 53, and 71 drives participating in the parallel rebuild process for the six, nine and twelve node configurations, respectively.
Node Rebuild (6 drives)
Here we show an example using a similar load level of data on the nodes. The nodes each contain six drives, for a maximum of 5.76TB to be rebuilt. The entire cluster of drives participates in taking over the workloads, automatically rearranging the data placement and making sure the cluster always has two copies of the data residing on different nodes. Just as in the above drive rebuild example, the process leverages all the remaining drives in the cluster to take on the rebuild and return to a fully protected state. That means 30 drives participate in the parallelized rebuild for the six-node configuration, 48 drives for the nine-node configuration and 66 drives for the twelve-node configuration.
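The drive counts quoted for both scenarios follow from simple arithmetic on the test layout (six 960GB drives per node). The short sketch below reproduces them. The even split of rebuild work across participating drives is an assumption made for illustration, not a measured result, and the helper names are hypothetical.

```python
DRIVES_PER_NODE = 6   # from the test configuration described in the text
DRIVE_GB = 960        # 960GB SAS SSD

def participants(nodes, failed_drives):
    """Drives left in the cluster to take part in the parallel rebuild."""
    return nodes * DRIVES_PER_NODE - failed_drives

def share_gb(nodes, failed_drives, fill):
    """Average data per participating drive, ASSUMING a perfectly even split."""
    lost_gb = failed_drives * DRIVE_GB * fill
    return lost_gb / participants(nodes, failed_drives)

# Maximum data held by one node: six full drives.
node_max_tb = DRIVES_PER_NODE * DRIVE_GB / 1000

for nodes in (6, 9, 12):
    after_drive_loss = participants(nodes, 1)
    after_node_loss = participants(nodes, DRIVES_PER_NODE)
    per_drive = share_gb(nodes, DRIVES_PER_NODE, 0.75)
    print(f"{nodes} nodes: {after_drive_loss} drives after a drive failure, "
          f"{after_node_loss} after a node failure, "
          f"{per_drive:.1f} GB per drive for a 75% full node")
```

Under the even-split assumption, the amount each drive has to handle shrinks as nodes are added, which is consistent with the shorter rebuild times reported for the larger configurations.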
Notice again the near linear improvement in rebuild times as you increase the number of nodes and drives. As in the drive rebuild scenario, the node rebuild time observed also tends to approach a vanishing point for the varying data saturation levels.
As mentioned previously, PowerFlex scales to 1000+ nodes. Take a scenario where an entire rack of servers is affected and you still need to remain operational and recoverable (unthinkable in conventional architectures), and you see why our largest customers depend on PowerFlex.
Testing Details
If the above tests were done just to show off the best rebuild times, we would have run these systems without any other work occurring. However, that wouldn’t reflect a real-world scenario where the intention is to continue operating gracefully and still recover to full operational levels.
These tests were done with the PowerFlex default rebuild setting of one concurrent I/O per drive. For customers with more aggressive needs to return to a fully protected state, PowerFlex can be configured to prioritize rebuilds. To accelerate rebuilds beyond what is illustrated here, you can set the number of concurrent I/Os per drive to two or more, or even unlimited. Since changing the number of I/Os per drive does affect latency and IOPS, which could adversely impact workloads, we chose to illustrate the default setting, which intentionally balances keeping workload performance high while doing the rebuild.
Using FIO* as a storage I/O generator, we ran these rebuild scenarios with ~750k random IOPS of activity on the 12-node configuration, ~600k random IOPS on the 9-node configuration and ~400k on the 6-node configuration, all while sustaining 0.5 ms latency levels (the cluster examples here can drive well over 1M IOPS at sub-ms levels). This represents a moderately heavy workload operating while we performed these tests. Even with the I/O generator running and the rebuild process taking place, the CPU load was approximately 20%. The I/O generator alone only consumed 8 to 10% of the available CPU capacity. Both CPU utilization figures underscore the inherent efficiency of PowerFlex’s software-defined infrastructure, which leaves a lot of capacity available to host application workloads. In this test scenario, compute and storage occupied the same nodes (hyperconverged), but remember that we can also run in a two-layer configuration using compute-only and storage-only nodes for asymmetrical scaling.
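For readers who want to generate a comparable background load, the sketch below assembles a fio command line matching the workload profile given in the footnote (8k blocks, 80% random reads, 20% random writes). Only the block size and read/write mix come from this article. The device path, queue depth, job count, runtime and I/O engine are assumptions chosen for illustration, not the settings used in these tests.

```python
import shlex

def fio_command(device, iodepth=16, numjobs=4, runtime_s=300):
    """Build a fio command line for an 8k, 80% read / 20% write random workload."""
    return [
        "fio",
        "--name=rebuild-background-load",  # hypothetical job name
        f"--filename={device}",            # assumed target; use a scratch volume
        "--ioengine=libaio",               # assumption: Linux async I/O engine
        "--direct=1",                      # bypass the page cache
        "--rw=randrw",                     # mixed random reads and writes
        "--rwmixread=80",                  # 80% reads, so 20% writes (per the footnote)
        "--bs=8k",                         # 8k block size (per the footnote)
        f"--iodepth={iodepth}",            # assumption
        f"--numjobs={numjobs}",            # assumption
        "--time_based",
        f"--runtime={runtime_s}",          # assumption, in seconds
        "--group_reporting",
    ]

cmd = fio_command("/dev/example-volume")   # hypothetical device path
print(shlex.join(cmd))
```

The script only prints the command. Because the workload includes writes, it should be pointed at a scratch volume rather than one holding data you care about.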
The systems used for these tests had the following configuration. Note that we used six drives per node in the R740xd chassis that can hold 24 drives, which means there were another 18 slots available for additional drives. As noted previously, more drives mean more parallel capabilities for performance and rebuild velocity.
- 12x R740xd nodes with 2 sockets Intel Xeon Gold 2126 2.6 GHz (12 cores/socket)
- Six had 256GB RAM & six utilized 192GB RAM
Conclusion
PowerFlex delivers cloud-scale performance with unrivaled grace-under-pressure reliability in a software-defined block storage product with six nines of availability. Be sure to read Part 1 of this blog, “Resiliency Explained — Understanding PowerFlex's Self-Healing, Self-Balancing Architecture,” to see the other protection architecture elements not covered here. For more information on our validated mission-critical workloads like Oracle RAC, SAP HANA, MySQL, MongoDB, SAS, Elastic, VDI, Cassandra and other business-differentiating applications, please visit our PowerFlex product site.
Footnotes
* FIO set for 8k blocks, 20% random writes, 80% random reads