ECS can be deployed as a single-site or multi-site instance. The building blocks of an ECS deployment include:
- Virtual Data Center (VDC) - A cluster, also referred to as a site or geographically distinct region, made up of a set of ECS infrastructure managed by a single fabric instance.
- Storage Pool (SP) - An SP is a subset of the nodes, and their associated storage, belonging to a VDC. A node can belong to only one SP. Erasure coding (EC) is set at the SP level, using either a 12+4 or a 10+2 scheme. An SP can also be used to physically separate data between clients or groups of clients accessing storage on ECS.
- Replication Group (RG) - RGs define where SP content is protected and the locations from which data can be accessed. An RG with a single member site is sometimes called a local RG. Data is always protected locally, at the site where it is written, against disk, node, and rack failures. RGs with two or more sites are often called global RGs. Global RGs span up to 8 VDCs and protect against disk, node, rack, and site failures. A VDC can belong to multiple RGs.
- Namespace - A namespace is conceptually the same as a tenant in ECS. A key characteristic of a namespace is that users from one namespace cannot access objects in another namespace.
- Buckets - Buckets are containers for objects created in a namespace and are sometimes considered logical containers for subtenants. In S3, containers are called buckets, and ECS has adopted this term. In Atmos, the equivalent of a bucket is a subtenant; in Swift, it is a container; and in CAS, it is a CAS pool. Buckets are global resources in ECS. Each bucket is created in a namespace and each namespace is created in an RG.
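The constraints stated in the list above can be sketched as a small data model. This is an illustrative sketch only, not ECS code or an ECS API: the class and function names (`ECScheme`, `StoragePool`, `VDC`, `replication_group`) are hypothetical, and reading "12+4" as 12 data fragments plus 4 coding fragments is an assumption based on common erasure-coding notation.

```python
from dataclasses import dataclass, field
from fractions import Fraction

MAX_VDCS_PER_RG = 8  # from the text: global RGs span up to 8 VDCs


@dataclass(frozen=True)
class ECScheme:
    data: int    # assumed meaning of the first number, e.g. 12 in "12+4"
    coding: int  # assumed meaning of the second number, e.g. 4 in "12+4"

    def overhead(self) -> Fraction:
        """Raw capacity consumed per unit of user data."""
        return Fraction(self.data + self.coding, self.data)


@dataclass
class StoragePool:
    name: str
    ec: ECScheme  # EC is set at the SP level
    nodes: set = field(default_factory=set)


@dataclass
class VDC:
    name: str
    pools: dict = field(default_factory=dict)  # pool name -> StoragePool

    def add_pool(self, name: str, ec: ECScheme) -> StoragePool:
        pool = StoragePool(name, ec)
        self.pools[name] = pool
        return pool

    def assign_node(self, node: str, pool_name: str) -> None:
        # A node can belong to only one SP.
        for pool in self.pools.values():
            if node in pool.nodes and pool.name != pool_name:
                raise ValueError(f"{node} already belongs to {pool.name}")
        self.pools[pool_name].nodes.add(node)


def replication_group(vdcs: list) -> tuple:
    """Return the member VDC names, enforcing the 1 to 8 site limit."""
    if not 1 <= len(vdcs) <= MAX_VDCS_PER_RG:
        raise ValueError("an RG needs between 1 and 8 member VDCs")
    return tuple(v.name for v in vdcs)
```

Under the assumed reading, `ECScheme(12, 4).overhead()` is 4/3 (about 1.33 units of raw capacity per unit of user data) and `ECScheme(10, 2).overhead()` is 6/5 (1.2 units).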
ECS leverages the following infrastructure systems:
- DNS - (required) Forward and reverse lookups are required for each ECS node.
- NTP - (required) Network Time Protocol server.
- SMTP - (optional) Simple Mail Transfer Protocol Server for sending alerts and reporting.
- DHCP - (optional) Required if assigning IP addresses using DHCP.
- Authentication Providers - (optional) ECS administrators can be authenticated using Active Directory and LDAP groups. Object users can be authenticated using Keystone. Authentication providers are not required for ECS. ECS has built-in local user management; note, however, that locally created users are not replicated between VDCs.
- Load Balancer - (required if workflow dictates, otherwise optional) Client load should be distributed across nodes to make effective use of all resources available in the system. If a dedicated load balancer appliance or service is needed to manage the load across ECS nodes, it should be considered a requirement. Developers writing applications with the ECS S3 SDK can take advantage of its built-in load balancer functionality. Sophisticated load balancers may take additional factors into account, such as a server's reported load, response times, up/down status, number of active connections, and geographic location. The customer is responsible for managing client traffic and determining access requirements. Regardless of method, a few basic options are generally considered: manual IP allocation, DNS round-robin, client-side load balancing, load balancer appliances, and geographic load balancers. The following are brief descriptions of these methods:
- Manual IP allocation - IP addresses are manually distributed to applications. This is generally not recommended as it may not distribute load or provide fault-tolerance.
- DNS round-robin - A DNS entry is created that includes all node IP addresses. Clients query DNS to resolve fully-qualified domain names for ECS services and are answered with the IP address of a random node. This may provide some pseudo-load balancing. This method may not provide fault tolerance, because manual intervention is often needed to remove the IP addresses of failed nodes from DNS. Time to live (TTL) issues may also be encountered: some DNS server implementations cache lookups for a period, so clients connecting within a short timeframe may bind to the same IP address, reducing the distribution of load across the data nodes. Using DNS to distribute traffic in a round-robin fashion is not recommended.
- Load balancing - Load balancers are the most common approach to distributing client load. Clients send traffic to a load balancer, which receives it and forwards it to a healthy ECS node. Proactive health checks or connection state are used to verify each node’s availability to service requests. Unavailable nodes are removed from use until they pass a health check. CPU-intensive SSL processing can also be offloaded to the load balancer, freeing up those resources on ECS.
- Geographic load balancing - Geographic load balancing leverages DNS to route lookups to an appliance, such as the Riverbed SteelApp, which uses Geo-IP or another mechanism to determine the best site to route the client to.
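The health-check behavior described for load balancers can be illustrated with a minimal sketch. It is not the ECS S3 SDK's built-in balancer or any vendor appliance: the class name `RoundRobinBalancer`, the injected `is_healthy` callable, and the example IP addresses are all hypothetical, and a real deployment would probe each node over the network instead of consulting a fixed set.

```python
from itertools import cycle
from typing import Callable, Iterable


class RoundRobinBalancer:
    """Rotate through nodes, skipping any that fail the health check."""

    def __init__(self, nodes: Iterable[str], is_healthy: Callable[[str], bool]):
        self._nodes = list(nodes)
        if not self._nodes:
            raise ValueError("at least one node is required")
        self._is_healthy = is_healthy  # hypothetical probe, supplied by the caller
        self._ring = cycle(self._nodes)

    def next_node(self) -> str:
        # Examine each node at most once per call so that a fully
        # unavailable cluster raises an error instead of looping forever.
        for _ in range(len(self._nodes)):
            node = next(self._ring)
            if self._is_healthy(node):
                return node
        raise RuntimeError("no healthy ECS nodes available")


# Example: the second node is marked down and is skipped until it recovers.
down = {"10.0.0.2"}
balancer = RoundRobinBalancer(
    ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    is_healthy=lambda node: node not in down,
)
picks = [balancer.next_node() for _ in range(4)]
# picks == ["10.0.0.1", "10.0.0.3", "10.0.0.1", "10.0.0.3"]
```

Because the health check runs on every selection, a node removed from the `down` set rejoins the rotation on the next call. This is the fault-tolerance property that manual IP allocation and DNS round-robin lack.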