Question: What are the durability and availability numbers for ECS? How many nines does ECS guarantee?
Data durability is a storage system's guarantee that data stored in the system is preserved without loss or corruption. ECS supports local and multi-site protection of data, and performs regular systematic data integrity checks with self-healing capability. ECS durability is 99.999999999% (eleven 9s).
Data availability is a storage system's guarantee that it can successfully process data read/write requests; it depends on factors outside of ECS control, including equipment, connectivity, and power failures. ECS availability is 99.999% (five 9s), based on request-error-rate estimates.
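To make the "nines" concrete, the arithmetic below converts an availability percentage into the maximum downtime it allows per year (illustrative calculation only, not part of any ECS specification):

```python
# Convert an availability percentage ("nines") into the maximum
# downtime it permits per year.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def allowed_downtime_minutes(availability_pct: float) -> float:
    """Maximum minutes of downtime per year at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

print(round(allowed_downtime_minutes(99.999), 2))  # five 9s -> ~5.26 min/year
print(round(allowed_downtime_minutes(99.9), 1))    # three 9s -> ~525.6 min/year
```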
Question: What is a Chunk?
Data and system metadata are written in chunks on ECS. An ECS chunk is a 128 MB logical container of contiguous space. Each replication group has its own allocation of chunks for the buckets and objects associated with it. A chunk can contain data from multiple objects, and ECS uses indexing to keep track of all the parts of an object, which may be spread across different chunks and nodes.
Chunks are written in an append-only pattern. This means that an application's request to modify or update an existing object does not modify or delete the previously written data within a chunk; instead, the modifications or updates are written to a new chunk. For that reason, no locking is required for I/O and no cache invalidation is required.
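The append-only write pattern can be sketched with a toy model (hypothetical class names, not ECS code): an "update" never rewrites bytes in an existing chunk; it appends the new version and repoints the object index.

```python
# Toy model of append-only chunk writes. Updates append new data and
# repoint the index; previously written bytes are never modified.

CHUNK_SIZE = 128 * 1024 * 1024  # 128 MB logical container

class ChunkStore:
    def __init__(self):
        self.chunks = [bytearray()]   # list of chunks; the last one is open
        self.index = {}               # object name -> (chunk_id, offset, length)

    def write(self, name: str, data: bytes):
        if len(self.chunks[-1]) + len(data) > CHUNK_SIZE:
            self.chunks.append(bytearray())          # seal old chunk, open a new one
        cid, off = len(self.chunks) - 1, len(self.chunks[-1])
        self.chunks[cid] += data                     # append only
        self.index[name] = (cid, off, len(data))     # repoint the index

    def read(self, name: str) -> bytes:
        cid, off, n = self.index[name]
        return bytes(self.chunks[cid][off:off + n])

store = ChunkStore()
store.write("obj", b"v1")
store.write("obj", b"version-2")   # old bytes b"v1" remain untouched in the chunk
print(store.read("obj"))           # b'version-2'
```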
Question: How does ECS protect data?
At the heart of ECS software is a Storage Engine that is responsible for laying out all data in 128 MB chunks across the system. User data, metadata, and system data are written to separate logical chunks, each containing up to 128 MB of data. Chunks are protected using triple mirroring and erasure coding.
System metadata is written to journal and B-tree chunks. Journal chunks are triple-mirrored, with each copy written to a single disk on a different node. B-tree chunks are erasure coded using the 12+4 scheme (12 data fragments plus 4 coding fragments).
User data less than 128 MB in size is initially written using triple mirroring, with one of the copies laid out in preparation for erasure coding.
Triple mirroring plus in-place erasure coding applies to chunks containing data from objects smaller than 128 MB. ECS creates three replica copies of a chunk that contains user data: one copy is written in fragments spread across different nodes within the cluster, and the remaining two copies are written in their entirety to different nodes. After a chunk is sealed, parity is calculated and written to disk, and the two whole-chunk replicas are removed. This process optimizes write performance for small objects, using triple mirroring for initial protection but ultimately leaving the chunk protected by erasure coding.
Inline erasure coding is used for objects that are 128 MB or larger. Parity is calculated as part of the initial write and distributed across nodes in the Virtual Data Center (VDC). Because no replica copies are created, this process optimizes large-write performance and saves disk I/O.
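The storage arithmetic behind the two protection stages can be sketched as follows (illustrative numbers; the 12+4 scheme is assumed and fragment sizes are idealized):

```python
# Rough on-disk footprint of one 128 MB chunk: triple-mirrored while open,
# erasure coded (12+4) once sealed.

CHUNK_MB = 128
DATA_FRAGS, CODE_FRAGS = 12, 4

# While open: one fragmented copy plus two full replicas (triple mirror).
mirrored_mb = 3 * CHUNK_MB                                  # 384 MB on disk

# After sealing: parity fragments added, the two full replicas removed.
ec_mb = CHUNK_MB * (DATA_FRAGS + CODE_FRAGS) / DATA_FRAGS   # ~170.7 MB on disk

print(mirrored_mb, round(ec_mb, 1))  # 384 170.7
```

Inline erasure coding pays only the erasure-coded footprint from the start, which is why it avoids the extra disk I/O of the mirrored copies.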
For geographically distributed systems, replication group policies determine how data is protected and where it can be accessed from. Data that is geo-replicated is protected by storing a primary copy of the data at the local site and a secondary copy at one or more remote sites. Each site is responsible for local data protection, meaning that both the local and secondary sites individually protect the data using erasure coding and/or triple mirroring. Replication is performed asynchronously: data is added to a replication queue as it is written to the primary site, and worker I/O threads continuously process the queue. With more than two sites in a replication group and "Replicate to All Sites" turned off, an XOR mechanism can be used to significantly reduce protection overhead. See the Dell ECS Overview and Architecture Guide for more details.
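The space-saving idea behind the XOR mechanism can be sketched minimally (illustrative code, not ECS internals): with three sites, a site can store the XOR of two remote chunks instead of full copies of both, and still reconstruct either one if a site is lost.

```python
# XOR-based protection sketch: site C stores A XOR B instead of both
# chunks, halving the remote-copy footprint for this pair.

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

chunk_a = b"data-from-site-A"
chunk_b = b"data-from-site-B"

parity = xor_bytes(chunk_a, chunk_b)      # single chunk kept at site C

# If site A fails, its chunk is recovered from site B's copy plus the parity:
recovered_a = xor_bytes(parity, chunk_b)
print(recovered_a == chunk_a)  # True
```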
Question: What are the limitations regarding Storage Pool, VDCs, RGs, Namespaces, Buckets, Objects?
The storage that is associated with a VDC must be assigned to a storage pool, and the storage pool must be assigned to one or more replication groups to allow the creation of buckets and objects. While a VDC can have a storage pool defined for each of the two erasure coding schemes (12+4 with a minimum of five nodes, or 10+2 with a minimum of six nodes), the best practice is to associate all nodes with one storage pool for a given VDC. A node cannot exist in more than one storage pool. A storage pool can span racks, but it is always within a site. When a storage pool reaches 90% of its total capacity, it stops accepting write requests and becomes read-only. A storage pool can be associated with more than one replication group.
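The 90% capacity rule can be expressed as a simple check (hypothetical helper, shown only to make the threshold behavior concrete):

```python
# A storage pool stops accepting writes once used capacity reaches
# 90% of total capacity; reads continue to be served.

READ_ONLY_THRESHOLD = 0.90

def pool_accepts_writes(used_tb: float, total_tb: float) -> bool:
    """True while the pool is below the read-only threshold."""
    return (used_tb / total_tb) < READ_ONLY_THRESHOLD

print(pool_accepts_writes(800, 1000))  # True  (80% used)
print(pool_accepts_writes(900, 1000))  # False (90% used -> read-only)
```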
The maximum number of VDCs per ECS federation and/or replication group is eight.
Replication group topologies are limited to the possible combinations of federated VDCs. Best practice is to create replication groups that match the access and protection requirements for the data being written across the federated sites. Most customers create a local replication group for a VDC where data should be kept local to that VDC, and one or more geo-replicated groups consisting of two or more VDCs for geo-protection of the data. Each replication group has chunks allocated for the buckets and objects associated with it.
There are no limitations to the number of namespaces that can be created. A namespace has a set of configured users who can store and access objects within the namespace. Users from one namespace cannot access the objects that belong to another namespace. An object in one namespace can have the same name as an object in another namespace. ECS can identify objects by the namespace qualifier.
There are no limitations to the number of buckets that can be created, nor are there limitations to the number of objects that can be created in a given bucket. A bucket is assigned to a namespace and object users are also assigned to a namespace. An object user can create buckets only in the namespace to which the object user is assigned.
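The namespace scoping rules above can be modeled in a short sketch (toy classes, not the ECS API): users and buckets belong to a namespace, a user may create buckets only in their own namespace, and names only need to be unique within a namespace.

```python
# Toy access model for namespace-scoped buckets and objects.

class Namespace:
    def __init__(self, name: str):
        self.name, self.buckets = name, {}

def create_bucket(user_namespace: Namespace, target: Namespace, bucket: str):
    """An object user can create buckets only in their assigned namespace."""
    if user_namespace is not target:
        raise PermissionError("users can create buckets only in their own namespace")
    target.buckets[bucket] = set()

ns1, ns2 = Namespace("ns1"), Namespace("ns2")
create_bucket(ns1, ns1, "logs")
ns1.buckets["logs"].add("report.csv")

create_bucket(ns2, ns2, "logs")          # same bucket name, different namespace: OK
ns2.buckets["logs"].add("report.csv")    # same object name is also fine

try:
    create_bucket(ns1, ns2, "other")     # cross-namespace create is rejected
except PermissionError as e:
    print(e)
```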
Question: Which hardware models are supported by ECS Tech Refresh?
ECS Tech Refresh supports the hardware models below when running ECS code 3.7.x or higher.
Note: The EX300 and EX3000 reached end of life (EOL) in 2022. It is recommended that you refresh to the EX500 or EX5000 at this time.
Note: ECS OS versions on new nodes must match the existing nodes in the VDC. The ECS code is pushed from existing nodes to the new nodes.
Question: What are the detailed steps of ECS Tech Refresh?
Tech Refresh is a data-in-place service with automated procedures, introduced in ECS 3.7.x. The technical steps for performing a tech refresh include:
Rack Extend:
Data Migration:
Node Evacuation:
Checks after tech refresh:
For more detail, see the Professional Services Procedures under SolVe.
Question: What are the types of Garbage Collection and how are they defined?
There are two garbage collection methods used by ECS to reclaim space from discarded chunks, depending on whether those chunks consist entirely of deleted objects or contain a mixture of deleted and non-deleted objects. They are:
Question: What automation platforms are supported and what is included?
Although Ansible is not officially supported with ECS today, it is possible to use Ansible Playbooks to automate the creation of namespaces, object users, buckets, and so on.
Question: What kind of file types are supported by S3 select?
S3 Select supports CSV, JSON, and Parquet formats, and it supports querying gzip- and bzip2-compressed objects of those file types.
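As a local sketch of what an S3 Select query does, the snippet below filters a gzip-compressed CSV with logic equivalent to `SELECT name FROM S3Object WHERE CAST(size AS INT) > 100` (stdlib only; against ECS the query would be sent through an S3 client's SelectObjectContent call rather than evaluated locally):

```python
# Local stand-in for an S3 Select query over a gzip-compressed CSV object.
import csv
import gzip
import io

rows = "name,size\nchunk1,128\nchunk2,64\n"
compressed = gzip.compress(rows.encode())  # stand-in for a gzip'd S3 object

# Decompress and filter, mimicking:
#   SELECT name FROM S3Object WHERE CAST(size AS INT) > 100
with io.TextIOWrapper(gzip.open(io.BytesIO(compressed))) as fh:
    result = [r["name"] for r in csv.DictReader(fh) if int(r["size"]) > 100]

print(result)  # ['chunk1']
```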