The future of Cloud-Native infrastructure is Resilient and Flexible
Mon, 13 Dec 2021 18:40:31 -0000|
Read Time: 0 minutes
Next generation infrastructures to support Cloud-Native workloads must be resilient and flexible to satisfy workload requirements while also reducing the management burden on IT staffers.
While much of the emphasis on the benefits of Cloud-Native infrastructure are focused on speed and agility from development to deployment, the rise of stateful containerized applications will force organizations to take resiliency, storage performance and data services more seriously. In the Voice of the Enterprise: DevOps, Workloads & Projects 2020 study, 56% of organizations have more than 50% applications that are stateful and this trend will rise as more production workloads run on containers.
The need for persistent storage also raises the stakes for data protection capabilities such as snapshots, replication, backup and disaster recovery. Even when it comes to non-mission critical and non-business critical workloads such as test/dev, organizations have minimal tolerance for downtime or data loss. The rising customer expectations for resiliency will only increase pressure on organizations to implement storage systems with rich data protection capabilities and the ability to automate the deployment of these features based on the importance of a particular workload.
Data placement and optimization continue to be key concerns in large scale environments, and it is important for next generation systems to provide intelligent load balancing to position data across nodes in a manner that makes optimal use of resources. These data placement capabilities need to be automated, since many of these operations will occur in the background when workloads are not as active.
Though it is tempting to go with a clean sheet approach when designing next generation infrastructures for emerging Cloud-Native workloads, workloads that are branded as “legacy” do not disappear, even if they are not top of mind in planning discussions. In interactions with organizations building out Cloud-Native infrastructures, it is far more common for them to be running their containerized workloads on top of or inside of VMs today, as opposed to building a new silo of infrastructure for Cloud-Native.
Just as VMs have not completely displaced workloads running on non-virtualized physical systems, we are still a long way from seeing all of the applications currently running in VMs shifting over completely to containers. Infrastructures which have the flexibility to provide compute and storage resources for physical, virtualized, and containerized workloads simultaneously will be necessary for many years.
For more information, please read the 451 Research Special Report:
Author: Henry Baltazar
Copyright © 2021 S&P Global Market Intelligence.
The content of this artifact is for educational purposes only. 451 Research, S&P Global Market Intelligence does not endorse any companies, technologies, products, services, or solutions.
Related Blog Posts
Multi-cloud Protection with PowerProtect Data Manager
Thu, 14 Apr 2022 20:12:59 -0000|
Read Time: 0 minutes
What I like most about PowerProtect Data Manager is that it supports the rising demand for data protection for all kind of organizations. It’s powerful, efficient, scalable and most importantly: a simple-to-use solution. And what could be simpler than using the same product with the same user interface on any environment, including any supported cloud platform?
PowerProtect Data Manager is usually used for deploying and protecting on-prem virtual machines running on VMware vSphere environments.
While PowerProtect Data Manager excels in protecting any on-prem machines and different types of technologies, such as Kubernetes, some organizations also have a cloud strategy where some or all their workloads and services are running on the cloud.
There are also organizations that use multiple cloud platforms to host and manage their workloads, and these resources need to be protected as well, especially in the cloud where there could be additional risk management and security considerations.
The good news is that PowerProtect Data Manager provides cloud and backup admins the same abilities and interface across all the supported cloud platforms: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).
AWS users can use the AWS Marketplace to deploy “Dell EMC PowerProtect Data Manager and PowerProtect DD Virtual Edition” which will trigger an automated deployment using the AWS CloudFormation service.
In this deployment method you’re asked to provide all the networking and security details ahead, and then it does everything else for you, including deploying a DDVE instance that will manage the backup copies for you (with deduplication!).
Once the CloudFormation stack is deployed, you can access the PowerProtect Data Manager through any web browser, and then add and protect your cloud resources, just as if it were an on-prem deployment – super intuitive and super easy!!
I think the trickiest part in the deployment is probably to make sure that all of the networking and firewall or other security and policy restrictions allow you to connect to the PowerProtect Data Manager VM and to the DDVE.
Check out this great whitepaper that describes the entire process of deploying PowerProtect Data Manager on AWS.
For Microsoft Azure users, the process here is similar. You can deploy PowerProtect Data Manager using the Azure Marketplace service:
This whitepaper will take you through the exact steps required to successfully deploy PowerProtect Data Manager and PowerProtect DDVE on your Azure subscription.
Didn’t I say it’s really easy and works the same way in all the cloud platforms?
GCP users can use the GCP Marketplace to deploy their PowerProtect Data Manager:
This whitepaper describes the entire deployment process with detailed screenshots on GCP.
Now you can easily protect your multi-cloud resources with the same powerful protection solution!
Author: Eli Persin
How PowerFlex Transforms Big Data with VMware Tanzu Greenplum
Wed, 13 Apr 2022 13:16:23 -0000|
Read Time: 0 minutes
Quick! The word has just come down. There is a new initiative that requires a massively parallel processing (MPP) database, and you are in charge of implementing it. What are you going to do? Luckily, you know the answer. You also just discovered that the Dell PowerFlex Solutions team has you covered with a solutions guide for VMware Tanzu Greenplum.
What is in the solutions guide and how will it help with an MPP database? This blog provides the answer. We look at what Greenplum is and how to leverage Dell PowerFlex for both the storage and compute resources in Greenplum.
Infrastructure flexibility: PowerFlex
If you have read my other blogs or are familiar with PowerFlex, you know it has powerful transmorphic properties. For example, PowerFlex nodes sometimes function as both storage and compute, like hyperconverged infrastructure (HCI). At other times, PowerFlex functions as a storage-only (SO) node or a compute-only (CO) node. Even more interesting, these node types can be mixed and matched in the same environment to meet the needs of the organization and the workloads that they run.
This transmorphic property of PowerFlex is helpful in a Greenplum deployment, especially with the configuration described in the solutions guide. Because the deployment is built on open-source PostgreSQL, it is optimized for the needs of an MPP database, like Greenplum. PowerFlex can deliver the compute performance necessary to support massive data IO with its CO nodes. The PowerFlex infrastructure can also support workloads running on CO nodes or nodes that combine compute and storage (hybrid nodes). By leveraging the malleable nature of PowerFlex, no additional silos are needed in the data center, and it may even help remove existing ones.
The architecture used in the solutions guide consists of 12 CO nodes and 10 SO nodes. The CO nodes have VMware ESXi installed on them, with Greenplum instances deployed on top. There are 10 segments and one director deployed for the Greenplum environment. The 12th CO node is used for redundancy.
The storage tier uses the 10 SO nodes to deliver 12 volumes backed by SSDs. This configuration creates a high speed, highly redundant storage system that is needed for Greenplum. Also, two protection domains are used to provide both primary and mirror storage for the Greenplum instances. Greenplum mirrors the volumes between those protection domains, adding an additional level of protection to the environment, as shown in the following figure:
By using this fluid and composable architecture, the components can be scaled independently of one another, allowing for storage to be increased either independently or together with compute. Administrators can use this configuration to optimize usage and deliver appropriate resources as needed without creating silos in the environment.
Testing and validation with Greenplum: we have you covered
The solutions guide not only describes how to build a Greenplum environment, it also addresses testing, which many administrators want to perform before they finish a build. The guide covers performing basic validations with FIO and gpcheckperf. In the simplest terms, these tools ensure that IO, memory, and network performance are acceptable. The FIO tests that were run for the guide showed that the HBA was fully saturated, maximizing both read and write operations. The gpcheckperf testing showed a performance of 14,283.62 MB/sec for write workloads.
Wouldn’t you feel better if a Greenplum environment was tested with a real-world dataset? That is, taking it beyond just the minimum, maximum, and average numbers? The great news is that the architecture was tested that way! Our Dell Digital team has developed an internal test suite running static benchmarked data. This test suite is used at Dell Technologies across new Greenplum environments as the gold standard for new deployments.
In this test design, all the datasets and queries are static. This scenario allows for a consistent measurement of the environment from one run to the next. It also provides a baseline of an environment that can be used over time to see how its performance has changed -- for example, if the environment sped up or slowed down following a software update.
Massive performance with real data
So how did the architecture fare? It did very well! When 182 parallel complex queries were run simultaneously to stress the system, it took just under 12 minutes for the test to run. In that time, the environment had a read bandwidth of 40 GB/s and a write bandwidth of 10 GB/s. These results are using actual production-based queries from the Dell Digital team workload. These results are close to saturating the network bandwidth for the environment, which indicates that there are no storage bottlenecks.
The design covered in this solution guide goes beyond simply verifying that the environment can handle the workload; it also shows how the configuration can maintain performance during ongoing operations.
Maintaining performance with snapshots
One of the key areas that we tested was the impact of snapshots on performance. Snapshots are a frequent operation in data centers and are used to create test copies of data as well as a source for backups. For this reason, consider the impact of snapshots on MPP databases when looking at an environment, not just how fast the database performs when it is first deployed.
In our testing, we used the native snapshot capabilities of PowerFlex to measure the impact that snapshots have on performance. Using PowerFlex snapshots provides significant flexibility in data protection and cloning operations that are commonly performed in data centers.
We found that when the first storage-consistent snapshot of the database volumes was taken, the test took 45 seconds longer to complete than initial tests. This result was because it was the first snapshot of the volumes. Follow-on snapshots during testing resulted in minimal impact to the environment. This minimal impact is significant for MPP databases in which performance is important. (Of course, performance can vary with each deployment.)
We hope that these findings help administrators who are building a Greenplum environment feel more at ease. You not only have a solution guide to refer to as you architect the environment, you can be confident that it was built on best-in-class infrastructure and validated using common testing tools and real-world queries.
The bottom line
Now that you know the assignment is coming to build an MPP database using VMware Tanzu Greenplum -- are you up to the challenge?
If you are, be sure to read the solution guide. If you need additional guidance on building your Greenplum environment on PowerFlex, be sure to reach out to your Dell representative.