Dell ECS: Data Movement (Copy to Cloud)
Tue, 18 Apr 2023 15:35:19 -0000
As unstructured data continues to grow exponentially, organizations face the challenge of managing and analyzing data in object storage. At some point, it might become necessary to move the data to another ECS cluster or to a public cloud so that object data can be managed flexibly and efficiently.
Data movement, also called copy-to-cloud, is a new feature in ECS 3.8.0.1 that lets a user copy local object data to an external S3 target, such as a nonfederated ECS or a public hyperscaler. (Currently, only AWS targets are supported.)
Data movement is built on the ECS Sync open-source tool, which provides the capability to copy data in parallel. The feature supports only IAM accounts and IAM buckets. Figure 1 shows the data movement solution architecture.
Figure 1. Data movement architecture
Data movement is configured as a bucket option in the UI, as shown in Figure 2. It can be monitored by an account admin or system admin within the UI. The admin can define policies that specify the source and target buckets and the criteria for selecting objects. The admin can also review the logs for all copy operations at the object level, including the copy time, source object key, object size, target endpoint, duration, and result of the copy operation (success or failure, with an error message). Alerts provide a summary of all copy operations and report errors for any failures.
Note: The target bucket must exist at ECS bucket creation time.
Figure 2. Configuring data movement in a bucket
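To make the object-level log fields listed above concrete, here is a minimal sketch of how such entries could be modeled and rolled up into a summary similar to what the alerts report. The class name, field names, and summary layout are hypothetical and chosen for illustration only; they are not the ECS log schema or an ECS API.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class CopyLogEntry:
    """One object-level copy record (hypothetical field names)."""
    copy_time: str            # when the copy was attempted, for example an ISO 8601 string
    source_key: str           # key of the object in the source bucket
    size_bytes: int           # object size
    target_endpoint: str      # external S3 endpoint that received the copy
    duration_ms: int          # how long the copy took
    success: bool             # result of the copy operation
    error_message: Optional[str] = None  # set only when the copy failed


def summarize(entries):
    """Roll object-level entries up into counts, bytes copied, and errors by message."""
    ok = [e for e in entries if e.success]
    failed = [e for e in entries if not e.success]
    return {
        "copied": len(ok),
        "failed": len(failed),
        "bytes_copied": sum(e.size_bytes for e in ok),
        "errors": dict(Counter(e.error_message or "unknown" for e in failed)),
    }
```

Grouping failures by error message is only one possible design; it makes repeated causes, such as a permissions problem on the target, stand out from one-off errors.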
The data movement service can run only on Gen2 or later systems that have been upgraded to 192 GB of memory. Data movement policies cannot sync deletes: if an object is deleted from the source bucket, it is not deleted from the target bucket. A data movement policy runs once an hour; if an object has many versions written within that hour, only the latest one is copied.
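The hourly behavior described above can be illustrated with a small sketch. It assumes a policy run looks at the object versions modified within a one-hour window and keeps only the newest version of each key. The type and function names are hypothetical, and this is not how ECS implements the policy internally. The function only ever produces copy candidates, which mirrors the fact that deletes are not propagated to the target.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ObjectVersion:
    """A version of an object in the source bucket (hypothetical model)."""
    key: str
    version_id: str
    last_modified: datetime


def select_for_copy(versions, window_start, window_end):
    """Return the newest version of each key modified in [window_start, window_end)."""
    latest = {}
    for v in versions:
        if not (window_start <= v.last_modified < window_end):
            continue  # outside this policy run's window
        current = latest.get(v.key)
        if current is None or v.last_modified > current.last_modified:
            latest[v.key] = v  # a newer version replaces the older candidate
    return sorted(latest.values(), key=lambda v: v.key)


if __name__ == "__main__":
    start = datetime(2023, 4, 18, 15, 0)
    versions = [
        ObjectVersion("report.csv", "v1", start + timedelta(minutes=5)),
        ObjectVersion("report.csv", "v2", start + timedelta(minutes=40)),
    ]
    # Only version v2 of report.csv is selected for this run.
    for v in select_for_copy(versions, start, start + timedelta(hours=1)):
        print(v.key, v.version_id)
```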
Dell is extending the ecosystem to support a multicloud experience for Snowflake, which runs in AWS. Dell and Snowflake customers can use on-premises data stored on Dell ECS while keeping their data local or seamlessly copying it to public clouds to leverage Snowflake’s ecosystem of cloud-based data analysis services.
The following workflow shows how Snowflake works with ECS data movement:
Figure 3. Data movement with Snowflake
- An application writes data to an ECS local bucket.
- A data movement policy in ECS is configured to copy all data or a subset of the data to a customer-owned predefined staging bucket within AWS.
- Data is written to the staging bucket.
- The staging bucket has S3 event notifications configured to notify an Amazon SQS queue, to which Snowflake is subscribed.
- A Snowflake data pipeline process called Snowpipe wakes up and ingests the data into Snowflake.
- Data can then be deleted from the staging bucket according to a lifecycle policy in AWS.
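To show what travels between the staging bucket and the queue in the workflow above, the following sketch parses an S3 event notification as it would appear in an SQS message body and lists the newly created objects. It assumes the bucket notifies SQS directly, so the body is the S3 event JSON itself rather than an SNS envelope. Snowpipe consumes these messages on its own; the function name here is hypothetical and the code is only an illustration of the message shape, not part of Snowflake or ECS.

```python
import json
from urllib.parse import unquote_plus


def created_objects(message_body: str):
    """Return (bucket, key, size) for each ObjectCreated record in an S3 event message."""
    event = json.loads(message_body)
    found = []
    for record in event.get("Records", []):
        # Ignore anything that is not an object-creation event.
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        s3 = record["s3"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(s3["object"]["key"])
        found.append((s3["bucket"]["name"], key, s3["object"].get("size")))
    return found
```

Filtering on the `ObjectCreated:` prefix covers the different ways an object can arrive in the bucket, such as a simple put or a completed multipart upload.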
Conclusion
Whatever the use case for data transfer may be, getting it done fast, reliably, securely, and consistently is important. And no matter how much data you have to move, where it’s located, or how much bandwidth you have, there is an option that can work for you. For a more in-depth look, check out the documentation: