Dell ECS: Data Movement (Copy to Cloud)
Tue, 18 Apr 2023 15:35:19 -0000
As unstructured data continues to grow exponentially, organizations face the challenge of managing and analyzing data in object storage. At some point, it might become necessary to move the data to another ECS cluster or a public cloud to manage object data flexibly and efficiently.
Data movement, also called copy-to-cloud, is a new feature in ECS 3.8.0.1 where a user can copy local object data to an external S3 target, such as a nonfederated ECS or a public hyperscaler. (Currently, only AWS targets are supported.)
Data movement is built on the open-source ECS Sync tool, which can copy data in parallel. The feature supports only IAM accounts and IAM buckets. Figure 1 shows the data movement solution architecture.
Figure 1. Data movement architecture
Data movement is configured as a bucket option in the UI, as shown in Figure 2. It can be monitored by an account admin or system admin within the UI. The admin can define policies that specify the source and target buckets and the criteria for selecting objects. The admin can also monitor the logs for all copy operations at the object level, including the copy time, source object key, object size, target endpoint, duration, and result of the copy operation (success or failure, with an error message). Alerts summarize all copy operations and report errors for any failures.
Note: The target bucket must exist at ECS bucket creation time.
Figure 2. Configuring data movement in a bucket
The data movement service can run only on Gen2 or later systems that have been upgraded to 192 GB of memory. Data movement policies do not sync deletes: if an object is deleted from the source bucket, it is not deleted from the target bucket. The data movement policy runs once an hour; if an object has several new versions within that hour, only the latest one is copied.
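The hourly behavior described above can be illustrated with a small model. This is a sketch only, not ECS code: the `Event` type, the `run_policy` function, and the key-to-version dictionary that stands in for the target bucket are all hypothetical. It does not model the case where an object is written and deleted within the same hour.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    """A change on the source bucket (hypothetical model, not an ECS type)."""
    key: str        # source object key
    timestamp: int  # seconds within the hour
    op: str         # "put" or "delete"
    version: str = ""  # version ID, only meaningful for puts


def run_policy(target: Dict[str, str], events: List[Event]) -> Dict[str, str]:
    """Apply one hourly run and return the new target state (key -> version)."""
    result = dict(target)  # leave the caller's dictionary untouched
    latest: Dict[str, Event] = {}
    for ev in events:
        if ev.op != "put":
            # Deletes are deliberately ignored: the policy does not sync them.
            continue
        current = latest.get(ev.key)
        if current is None or ev.timestamp >= current.timestamp:
            latest[ev.key] = ev  # keep only the newest version per key
    for key, ev in latest.items():
        result[key] = ev.version
    return result
```

For example, if key `a` receives three versions in one hour and key `b` (copied in an earlier run) is deleted at the source, the target ends up with only the newest version of `a` and still holds `b`.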
Dell is extending the ecosystem to support a multicloud experience for Snowflake, which runs in AWS. Dell and Snowflake customers can use on-premises data stored on Dell ECS while keeping their data local or seamlessly copying it to public clouds to leverage Snowflake’s ecosystem of cloud-based data analysis services.
The following workflow shows how Snowflake works with ECS data movement:
Figure 3. Data movement with Snowflake
- An application writes data to an ECS local bucket.
- A data movement policy in ECS is configured to copy all data or a subset of the data to a customer-owned predefined staging bucket within AWS.
- Data is written to the staging bucket.
- The AWS bucket has S3 event notifications configured to notify an AWS SQS queue, to which Snowflake is subscribed.
- A Snowflake data pipeline process called Snowpipe wakes up and ingests the data into Snowflake.
- Data can then be deleted according to the lifecycle policy in AWS.
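To make the notification step more concrete, the sketch below parses a message body of the kind S3 places on an SQS queue and lists the newly created objects. Snowpipe does this work itself; the code is only an illustration. The `created_objects` function, the bucket name, and the sample keys are hypothetical, while the `Records`, `eventName`, and `s3` fields follow the Amazon S3 event notification format, in which object keys are URL-encoded.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def created_objects(sqs_body: str) -> List[Tuple[str, str, int]]:
    """Return (bucket, key, size) for every ObjectCreated record in the body."""
    message = json.loads(sqs_body)
    found = []
    for record in message.get("Records", []):
        # Skip anything that is not an object-creation event.
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        s3 = record["s3"]
        key = unquote_plus(s3["object"]["key"])  # keys arrive URL-encoded
        found.append((s3["bucket"]["name"], key, s3["object"].get("size", 0)))
    return found


# Hypothetical sample body: one created object and one removed object.
SAMPLE_BODY = json.dumps({
    "Records": [
        {"eventName": "ObjectCreated:Put",
         "s3": {"bucket": {"name": "staging-bucket"},
                "object": {"key": "landing/2023/sales+report.csv",
                           "size": 1024}}},
        {"eventName": "ObjectRemoved:Delete",
         "s3": {"bucket": {"name": "staging-bucket"},
                "object": {"key": "landing/old.csv"}}},
    ]
})

if __name__ == "__main__":
    for bucket, key, size in created_objects(SAMPLE_BODY):
        print(bucket, key, size)
```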
Conclusion
Whatever the use case for data transfer may be, getting it done fast, reliably, securely, and consistently is important. And no matter how much data you have to move, where it’s located, or how much bandwidth you have, there is an option that can work for you. For a more in-depth look, check out the documentation:
Related Blog Posts
Introducing the Latest Release of Dell ECS 3.8.1
Tue, 02 Apr 2024 18:32:26 -0000
IDC predicts that the Global Datasphere will grow to 221 zettabytes by 2026, more than 90% of which is unstructured in nature. ECS, the leading enterprise-class object storage platform from Dell Technologies, has been engineered to support both traditional and next-generation workloads. ECS delivers the capabilities of the public cloud with the command and control of a private cloud infrastructure as an S3-compatible, globally scalable object store.
New Features in ECS 3.8.1
The latest release of ECS 3.8.1 introduces a range of innovative features and enhancements that make significant strides in advancing the field of enterprise object storage solutions.
Azure AD OBO Support
More customers are moving to Azure AD, and more applications authenticate with OIDC (OpenID Connect) yet still need to talk to a service provider such as ECS, which supports SAML (Security Assertion Markup Language). Applications in this environment use the Azure AD On-Behalf-Of (OBO) workflow to exchange their OIDC token for a SAML assertion. With support for this workflow, customers can integrate their S3 applications with Azure AD to authenticate identities.
The OAuth 2.0 On-Behalf-Of flow (OBO) is ideal for use cases where an application invokes a service/web API and needs to call another service/web API. The idea is to propagate the delegated user identity and permissions through the request chain. For the middle-tier service to make authenticated requests to the downstream service, it needs to secure an access token from the Microsoft identity platform, on behalf of the user.
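As a rough illustration of the token exchange, the sketch below builds the form body that a middle-tier service might post to the Microsoft identity platform token endpoint to request a SAML 2.0 assertion on behalf of a user. The parameter names follow the Microsoft identity platform documentation for the OBO flow as best understood here and should be checked against the current documentation; the `build_obo_request` function and every value passed to it are hypothetical. The code only builds the request and sends nothing.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlencode

# v2.0 token endpoint of the Microsoft identity platform.
TOKEN_URL_TEMPLATE = "https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"


def build_obo_request(tenant: str, client_id: str, client_secret: str,
                      user_token: str, scope: str) -> Tuple[str, str]:
    """Return (url, form_body) for an On-Behalf-Of request for a SAML token."""
    form = {
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "client_id": client_id,
        "client_secret": client_secret,
        "assertion": user_token,  # the token the application received
        "scope": scope,
        "requested_token_use": "on_behalf_of",
        # Ask for a SAML 2.0 assertion instead of a JWT access token.
        "requested_token_type": "urn:ietf:params:oauth:token-type:saml2",
    }
    return TOKEN_URL_TEMPLATE.format(tenant=tenant), urlencode(form)


if __name__ == "__main__":
    # All values below are placeholders.
    url, body = build_obo_request("example-tenant", "app-id", "app-secret",
                                  "user-oidc-token", "api://ecs-app/.default")
    print(url)
    print(sorted(parse_qs(body)))
```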
ECS IAM for S3 works with SAML identity providers, which handle authentication and SAML assertion generation. This allows applications to:
- Authenticate with an identity provider and, if successful, receive a SAML assertion.
- Use the SAML assertion to call the ECS Secure Token Service (STS) API (AssumeRoleWithSAML method) and retrieve a temporary set of credentials, which allows the caller to assume a role.
- Perform the ECS API calls that the role allows, using the temporary credentials.
Note: In the SAML model, a participant has one of two main roles: Identity Provider or Service Provider. In the ECS IAM SAML design, ECS acts as the Service Provider, and Azure AD acts as the Identity Provider and generates SAML assertions.
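The AssumeRoleWithSAML step can be sketched as follows. The parameter names (`Action`, `Version`, `RoleArn`, `PrincipalArn`, `SAMLAssertion`, `DurationSeconds`) are those of the AWS STS query API, which the ECS STS is assumed here to mirror; the role and provider identifiers and the assertion content are placeholders, and the function name is hypothetical. The code only builds the request parameters and does not contact ECS.

```python
import base64
from urllib.parse import parse_qs, urlencode


def build_assume_role_with_saml(role_arn: str, principal_arn: str,
                                saml_xml: bytes,
                                duration_seconds: int = 3600) -> str:
    """Return URL-encoded parameters for an AssumeRoleWithSAML call."""
    params = {
        "Action": "AssumeRoleWithSAML",
        "Version": "2011-06-15",
        "RoleArn": role_arn,            # role the caller wants to assume
        "PrincipalArn": principal_arn,  # SAML provider registered in IAM
        # The assertion is sent base64-encoded.
        "SAMLAssertion": base64.b64encode(saml_xml).decode("ascii"),
        "DurationSeconds": str(duration_seconds),
    }
    return urlencode(params)


if __name__ == "__main__":
    # Placeholder identifiers; real values come from the ECS IAM configuration.
    query = build_assume_role_with_saml(
        "urn:ecs:iam::ns1:role/analytics",
        "urn:ecs:iam::ns1:saml-provider/azure-ad",
        b"<Assertion>placeholder</Assertion>",
    )
    print(query)
```

The response to a successful call carries the temporary access key, secret key, and session token that the application then uses to sign its S3 requests.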
HDFS Deprecated in ECS
Starting with ECS 3.8.1, HDFS support is removed because customers now access ECS from Hadoop through S3A. The release makes the following changes:
- Disabled HDFS head port (9040) on ECS
- Removed CMF (Configuration Framework) support for port enable/disable with HDFS
- Discontinued Datahead service on port 9040
- Removed HDFS client jar file from downloads
Note: When ECS is upgraded to 3.8.1, HDFS will stop working if it is being used.
New CAS Consistency Option
ECS uses strong consistency, and concurrent object conflicts are resolved by redirecting all object operations to the object owner. However, operations might experience additional latency if an object or bucket owner is at a remote VDC (Virtual Data Center). This is a problem for some CAS applications that are very sensitive to read latency.
Starting with ECS 3.8.1, customers can set a new consistency mode on a CAS bucket. The main benefit is lower read latency when an object or bucket owner is at a remote VDC. With the new mode enabled, read performance should improve because ECS no longer checks with the bucket or object owner, as strong consistency typically requires.
A CAS bucket with the new consistency mode is created from the bucket configuration page in the UI. The new mode is supported only on CAS buckets that have ADO RW enabled. Create, update, and delete operations are still redirected to the object owner zone.
Simplified Bucket Deletion
The task of deleting a bucket is simplified by incorporating the object deletion process. Customers no longer need to empty a bucket prior to requesting a bucket deletion. Through the user interface, a customer may delete a bucket in ECS, even if it is not empty.
A new bucket deletion dialog is introduced in the UI. During S3 bucket deletion:
- Bucket access is set to read only.
- No property changes on bucket are allowed.
- MPUs are aborted.
- Objects and object versions are removed.
- User permissions, object lock, governance, compliance, retention, and legal hold are honored during delete.
- Once all objects are deleted, the bucket will be removed.
- If objects cannot be deleted, the bucket is returned to a writable state.
File system-enabled buckets are also supported by the simplified bucket deletion feature. NFS exports must be removed before the bucket is deleted. During the deletion process, NFS access (read or write) is not allowed.
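The deletion sequence listed above can be modeled as a short state transition. This is a simplified sketch, not the ECS implementation: the `Bucket` and `Obj` classes, the single `locked` flag that stands in for object lock, retention, and legal hold, and the returned status strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Obj:
    key: str
    locked: bool = False  # stands in for object lock, retention, legal hold


@dataclass
class Bucket:
    name: str
    objects: List[Obj] = field(default_factory=list)
    pending_mpus: List[str] = field(default_factory=list)
    nfs_exports: List[str] = field(default_factory=list)
    read_only: bool = False


def delete_bucket(bucket: Bucket) -> str:
    """Run the simplified deletion flow and return a status string."""
    if bucket.nfs_exports:
        # NFS exports must be removed before deletion can start.
        return "rejected"
    bucket.read_only = True      # bucket access is set to read only
    bucket.pending_mpus.clear()  # multipart uploads are aborted
    # Remove every object that is not protected; locks are honored.
    bucket.objects = [o for o in bucket.objects if o.locked]
    if bucket.objects:
        bucket.read_only = False  # bucket goes back to a writable state
        return "failed"
    return "deleted"
```

A bucket whose objects are all unprotected ends in the "deleted" state, while one that still holds a locked object is left writable with only the protected objects remaining.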
Fabric Improvements with Mixed Memory Cluster
The fabric improvement for mixed memory clusters resolves a problem where 192 GB nodes were set to a 64 GB memory profile. This occurred when certain service procedures were run, such as node expansions, node replacements, or software upgrades. With this improvement, service procedures leave the allocated memory profile aligned with the memory that is physically available on the node.
Conclusion
Dell ECS offers a sophisticated solution for deploying and managing enterprise-grade object storage. With its well-designed architecture and robust protective features, it presents a compelling option for organizations in pursuit of flexibility, scalability, performance, and security in their object storage solutions.
For more information about the new features, see the ECS 3.8.1 release notes.
What's new in ECS 3.8.0.1
Thu, 08 Dec 2022 15:40:00 -0000
ECS 3.8.0.1 includes these new and updated features:
- Increased flexibility with the new data mobility feature
- Expanded external key management (KMIP) support
- Object Lock enhancements for ADO
- Security Token Service (STS) GetFederationToken support
- New hardware disk sizing option for EX500
- Memory upgrade expansion
Data mobility
Data mobility, also known as copy-to-cloud, is a new feature in version 3.8.0.1. With data mobility, a user can copy local object data to an external S3 target, such as a secondary ECS that is not federated or a public cloud target (currently AWS targets only).
Data mobility is configured as a bucket option in the UI, as shown in Figure 1. It can be monitored by an account admin or system admin within the UI. The admin can define policies about source and target buckets and criteria for objects. The admin can also monitor the logs for all copy operations at the object level, including the copy time, source object key, object size, target endpoint, duration, and result of the copy operation.
Figure 1. Data mobility configuration
We have also extended our ecosystem to support a multicloud experience for Snowflake, which runs on the AWS public cloud platform. Dell and Snowflake customers can use on-premises data stored on Dell ECS while keeping their data local, or seamlessly copy it to public clouds to use Snowflake’s ecosystem of cloud-based data analysis services.
Expanded external key management support
The Key Management Interoperability Protocol (KMIP) is an extensible communication protocol that defines message formats for the manipulation of cryptographic keys on a key management server.
A new external key management cluster type that supports Thales CipherTrust has been added, because Gemalto SafeNet KeySecure reaches end of life on December 31, 2023. ECS customers who are using KeySecure can migrate to CipherTrust Manager.
Object Lock support with ADO enabled
ECS 3.8.0.1 adds Object Lock enhancements for Access During Outage (ADO). Object Lock now supports ADO Read Only (RO) by default. For Read Write (RW) mode, ECS continues to deny setting Object Lock on ADO buckets by default. Flags at the namespace level and the individual bucket level let users enable the feature after acknowledging the risk of losing locked versions during a temporary site outage (TSO). To enable Object Lock with ADO RW, customers should refer to the latest ECS Data Access Guide or ask the Dell support team for help. Once the flags are set on a bucket, they cannot be disabled.
The following table shows the Object Lock support matrix:
ECS version | Setting flags | Non-ADO | ADO RO | ADO RW |
3.6.2/3.7/partial upgrade | Cannot set flags | Yes | No | No |
Full upgrade to 3.8.0.1 | Set to not allowed (by default) | Yes | Yes | No |
Full upgrade to 3.8.0.1 | Set to allowed | Yes | Yes | Yes |
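The support matrix can also be read as a small decision function. The sketch below encodes the three rows of the table; the function name, argument names, and the bucket mode strings are hypothetical and are not part of any ECS API.

```python
def object_lock_supported(fully_upgraded: bool, rw_flag_allowed: bool,
                          bucket_mode: str) -> bool:
    """Return True if Object Lock can be used, per the support matrix.

    fully_upgraded   -- True only after a full upgrade to 3.8.0.1
    rw_flag_allowed  -- True if the ADO RW flag has been set to allowed
    bucket_mode      -- "non-ado", "ado-ro", or "ado-rw"
    """
    if bucket_mode not in ("non-ado", "ado-ro", "ado-rw"):
        raise ValueError(f"unknown bucket mode: {bucket_mode}")
    if bucket_mode == "non-ado":
        return True   # supported in every row of the table
    if not fully_upgraded:
        return False  # 3.6.2, 3.7, or partial upgrade: flags cannot be set
    if bucket_mode == "ado-ro":
        return True   # supported by default after the full upgrade
    return rw_flag_allowed  # ADO RW requires the flag to be set to allowed
```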
STS GetFederationToken support
The GetFederationToken API is now part of Security Token Service (STS), along with AssumeRole and AssumeRoleWithSAML. It is called by the IAM user, and it returns a set of temporary security credentials (consisting of an access key ID, a secret access key, and a security token) for that user. This operation federates the user. A typical use is in a proxy application that gets temporary security credentials on behalf of distributed applications inside a corporate network. The ECS 3.8 Administration Guide on Dell Technologies Support provides more details.
New hardware disk sizing option
ECS version 3.8.0.1 extends disk sizing support. Customers can now select an ECS EX500 20 TB disk option.
Memory upgrade expansion
ECS version 3.8.0.1 supports memory upgrade expansion to 192 GB on EX300, EX500, EX3000, and Gen 2 platforms, with the support of Dell Professional Services. For more information, contact your ECS Customer Service representative.
Conclusion
ECS version 3.8.0.1 introduces new and updated platform features. It reiterates the core value proposition of ECS and our broader object portfolio, amplifying the benefits of the new capabilities in the 3.8.0.1 payload and related hardware updates.
For more information, see the ECS: Overview and Architecture White Paper on the ObjectScale and ECS Info Hub.