A write operation in ECS starts when a client sends a request to a node. ECS is designed as a distributed architecture, so any node in a VDC/site can respond to a read or write request. A write request involves writing the object data and custom object metadata, and recording the transaction in a journal log. Once this activity is complete, an acknowledgment is sent to the client.
The following figure and steps provide a high-level overview of a write workflow.
Figure 5. Object write workflow
- A write object request is received. Any node can respond to this request. In this example, Node 1 processes the request.
- Depending on the size of the object, the data is written to one or more chunks. Each chunk is protected using advanced data protection schemes such as triple mirroring and erasure coding. Before writing the data to disk, ECS runs a checksum function and stores the result.
The data is added to a chunk. Because this object is only 10 MB in size, it uses the triple mirroring plus in-place erasure coding scheme. This scheme results in writes to three disks on three different nodes—Node 2, Node 3, and Node 4 in this example. These three nodes send acknowledgments back to Node 1.
- After the object data is written successfully, the object metadata is stored. In this example, Node 3 owns the partition of the object table to which this object belongs. As owner, Node 3 writes the object name and chunk ID to this partition of the object table’s journal logs. Journal logs are triple mirrored, so Node 3 sends replica copies to three different nodes in parallel (Node 1, Node 2, and Node 3 in this example).
- Acknowledgment is sent to the client.
- In a background process, the memory table is updated.
- Once the memory table becomes full or after a set period of time, the table is merged, sorted, or dumped to B+ trees as chunks. Then, a checkpoint is recorded in the journal.
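
The ordering of the steps above can be illustrated with a small in-memory simulation. This is a minimal sketch for illustration only, not ECS code: the names `Node`, `ObjectTablePartition`, `write_object`, `update_memory_table`, and `dump_and_checkpoint` are hypothetical, SHA-256 stands in for the unspecified ECS checksum function, a sorted list stands in for a B+ tree, and erasure coding is not modeled.

```python
import hashlib
from dataclasses import dataclass, field

REPLICAS = 3  # triple mirroring: three copies on three different nodes


@dataclass
class Node:
    name: str
    disk: dict = field(default_factory=dict)     # chunk_id -> (data, checksum); stand-in for a disk
    journal: list = field(default_factory=list)  # replica copies of journal entries


@dataclass
class ObjectTablePartition:
    owner: Node
    memory_table: dict = field(default_factory=dict)  # object name -> chunk_id
    pending: list = field(default_factory=list)       # journaled, not yet in the memory table
    dumps: list = field(default_factory=list)         # sorted dumps (B+ tree stand-in)
    checkpoints: list = field(default_factory=list)   # journal length covered by each dump
    journal_len: int = 0


def write_object(name, data, chunk_id, data_nodes, partition, journal_nodes):
    """Steps 1-4: checksum, mirror data, mirror journal entry, then acknowledge."""
    if len(data_nodes) != REPLICAS or len(journal_nodes) != REPLICAS:
        raise ValueError("triple mirroring needs exactly three target nodes")
    # Checksum is computed before the data is written (SHA-256 is an assumption).
    checksum = hashlib.sha256(data).hexdigest()
    for node in data_nodes:                 # data goes to three different nodes
        node.disk[chunk_id] = (data, checksum)
    entry = (name, chunk_id)                # object name and chunk ID
    for node in journal_nodes:              # journal entry is triple mirrored
        node.journal.append(entry)
    partition.journal_len += 1
    partition.pending.append(entry)
    return "ACK", checksum                  # the client is acknowledged here


def update_memory_table(partition):
    """Step 5: background update of the memory table."""
    for name, chunk_id in partition.pending:
        partition.memory_table[name] = chunk_id
    partition.pending.clear()


def dump_and_checkpoint(partition):
    """Step 6: dump the sorted table, then record a checkpoint."""
    partition.dumps.append(sorted(partition.memory_table.items()))
    partition.checkpoints.append(partition.journal_len)
    partition.memory_table.clear()


# Mirror the example in the text: data on Nodes 2-4, journal replicas on Nodes 1-3.
nodes = {i: Node(f"Node {i}") for i in range(1, 5)}
partition = ObjectTablePartition(owner=nodes[3])
ack, checksum = write_object(
    "photo.jpg", b"x" * 1024, "chunk-0001",
    data_nodes=[nodes[2], nodes[3], nodes[4]],
    partition=partition,
    journal_nodes=[nodes[1], nodes[2], nodes[3]],
)
update_memory_table(partition)
dump_and_checkpoint(partition)
```

Note that `write_object` returns the acknowledgment before the memory table is touched. This matches the workflow above, where the journal write makes the transaction durable and the memory table update and dump happen later in the background.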