The container platform in this architecture is Symcloud Platform, which is based on Kubernetes. The container platform abstracts machine-level details and host operating system dependencies, exposing them as a pool of compute, storage, and communications resources. It also provides capabilities to streamline the deployment, scaling, and life cycle management of analytics workloads. The container platform implementation supports the agility, flexibility, and scalability of the architecture while supporting diverse analytics workloads.
Symcloud Platform uses a role-based deployment model. Three roles can be assigned to nodes: Manager, Compute, and Storage.
These roles can be assigned to nodes depending on their configuration and intended usage. The roles are not mutually exclusive. Dell Technologies recommends the role assignments that are shown in Recommended Symcloud Platform role assignments.
| Symcloud Platform role | Recommended nodes | Description |
|---|---|---|
| Manager | Control plane 1, Control plane 2, Control plane 3 | The Manager role provides the core control plane services that the cluster requires. The Manager role can be assigned to a maximum of three nodes. High availability deployments require three control plane nodes. |
| Compute | Worker 1 through Worker <n> | The Compute role indicates a node that is intended to host pods and their respective containers. Application deployments use memory and processor resources from these nodes. |
| Storage | Worker 1 through Worker <n> | The Storage role indicates a node that is intended to provide storage services through Symcloud Storage. Volumes that are needed for deployed applications are created on devices from these nodes. |
The platform provides application orchestration capabilities to streamline the deployment and management of analytics workloads. The Symcloud Application Workflow Manager supports end-to-end automation and can deploy entire application pipelines seamlessly. Applications on the platform must be containerized; they run under Kubernetes as containers inside pods.
Applications are deployed using the Symcloud application bundle framework. Using this deployment framework, an application bundle contains all the resources necessary to deploy an application. Applications that are deployed with application bundles are known to the platform and integrated into the advanced application orchestration and management capabilities.
When an application is launched, the Symcloud scheduler provisions the required compute, storage, and network resources and then starts the application pods. Since the platform is aware of the entire application, the scheduler can use advanced placement techniques, including data locality, data affinity, and infrastructure awareness.
Once the application starts, it can be managed using the platform's life cycle management capabilities, including resource scaling, backups, and snapshots.
Helm charts can also be used to deploy applications. After a Helm release is registered with the platform, the life cycle management capabilities are available to applications using Symcloud Storage.
For more detail on application bundles, see Workload design.
The platform uses the concept of a tenant to organize users into groups. Tenants can be organized according to functional or business requirements. System resources are assigned to each tenant, and applications are installed and run within the limits of the tenant resources. Resource pools are used to define the available compute and storage resources for a tenant. IP pools are used to define the available network resources.
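Tenant resource pools serve a purpose similar to Kubernetes ResourceQuota objects, which cap the compute that a group of workloads can consume. As an illustration only (the Symcloud resource-pool API is not shown in this guide), a tenant's compute limits might be modeled like this, with all names and sizes being example values:

```python
# Hypothetical illustration: a Kubernetes ResourceQuota manifest modeled as a
# Python dict. Symcloud tenant resource pools play a similar role; the tenant
# name and limits below are example values, not the Symcloud API.
tenant_quota = {
    "apiVersion": "v1",
    "kind": "ResourceQuota",
    "metadata": {"name": "analytics-tenant-quota", "namespace": "analytics"},
    "spec": {
        "hard": {
            "requests.cpu": "64",            # total CPU cores the tenant may request
            "requests.memory": "256Gi",      # total memory the tenant may request
            "persistentvolumeclaims": "20",  # cap on volume claims
        }
    },
}

def within_quota(requested_cpu: int, quota: dict) -> bool:
    """Check a CPU request against the tenant's hard CPU limit."""
    return requested_cpu <= int(quota["spec"]["hard"]["requests.cpu"])
```

Applications launched within the tenant succeed only while their aggregate requests stay inside these limits, which is the behavior the platform's resource pools enforce.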
The platform supports Linux Kernel-based Virtual Machines (KVM) alongside containerized applications. This capability supports running workloads that have not been containerized.
Application bundles can be created that specify KVM parameters and the corresponding qcow2 image that the virtual machine uses. Once virtual machines launch, they are treated like any other application on the platform, with the full range of life cycle management operations available.
When an application is launched, the Symcloud scheduler provisions the application pods on nodes with the Compute role that have the appropriate processor and memory resources. The tenant resource pool and current utilization define the available resources. Compute resources are allocated at the container level. The application bundle specifies the amount of memory that is required and the desired processor core count.
An application can also request GPU resources. In this case, the GPU type is specified in the application bundle, and the GPU requirements are included in the scheduling decision at launch time. Entire GPUs and Multi-Instance GPU (MIG) resources can be requested. Multiple GPUs are supported, and nodes in the cluster can have varying GPU models and quantities. GPU resources are allocated and dedicated to the container for the life of the container.
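The application bundle syntax itself is not reproduced in this guide, but the underlying Kubernetes container spec gives a sense of how processor, memory, and GPU requirements are expressed. A hedged sketch, using the standard `nvidia.com/gpu` resource name exposed by the NVIDIA GPU operator and example values throughout:

```python
# Hedged sketch: compute and GPU requirements as they appear in a Kubernetes
# container spec. The container name and image are placeholders; the resource
# key "nvidia.com/gpu" is the standard name provided by the NVIDIA GPU operator.
container_spec = {
    "name": "spark-executor",                # example container name
    "image": "example.com/spark:latest",     # placeholder image
    "resources": {
        "requests": {"cpu": "8", "memory": "32Gi"},
        "limits": {
            "cpu": "8",
            "memory": "32Gi",
            "nvidia.com/gpu": "1",  # one whole GPU, dedicated for the container's life
        },
    },
}
```

Because GPU resources cannot be oversubscribed, the request and limit are the same value, which matches the dedicated-for-life allocation described above.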
The platform supports NVIDIA GPUs through the NVIDIA GPU driver that is installed at the operating system level, and the NVIDIA GPU Operator for Kubernetes. To use GPU acceleration, application images must also include support libraries that implement GPU support at the user level, such as NVIDIA CUDA and the NVIDIA RAPIDS Accelerator for Apache Spark.
The platform supports several different types of storage for ephemeral and persistent storage, and for data lakehouse storage.
Symcloud Storage is a scalable, high-performance software defined storage system that is Kubernetes Container Storage Interface (CSI) compatible. It provides data resiliency through replication, and supports encryption, compression, and thin-provisioning. Symcloud Storage is application and infrastructure aware, allowing it to support data locality, snapshots, and backups for applications running on the platform. Symcloud Storage is hosted on nodes with the storage role. At installation time, it discovers available disks and pools them to provide storage to applications.
Symcloud Storage uses volumes as the unit of allocation. These volumes are analogous to Kubernetes PersistentVolume objects. The replication factor, encryption, and compression properties are specified when a volume is created. The storage class can also specify the preferred media type, either HDD, SSD, or NVMe.
Symcloud Storage is exposed to applications through the Kubernetes CSI using a Kubernetes StorageClass object. Symcloud Storage ships with three predefined StorageClasses:
- robin: The default StorageClass. It has no advanced features and can be used for standard ReadWriteOnce (RWO) and ReadWriteMany (RWX) volumes.
- robin-repl-3: A StorageClass that uses three replicas for data reliability.
- robin-immediate: A StorageClass that creates a volume as soon as a Persistent Volume Claim (PVC) is created, rather than waiting for the first consumer of that volume.
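Applications select one of these StorageClasses through an ordinary Kubernetes PersistentVolumeClaim. A minimal sketch, with the claim name and size as example values:

```python
# Hedged sketch: a PersistentVolumeClaim that selects the robin-repl-3
# StorageClass, so the resulting volume is created with three replicas.
# The claim name and requested size are example values.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "warehouse-data"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": "robin-repl-3",
        "resources": {"requests": {"storage": "500Gi"}},
    },
}
```

The StorageClass name is the only field that changes when choosing between the predefined classes; the rest of the claim is standard Kubernetes.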
The platform uses file collections to store application bundles, images, and collected logs. File collections are exposed through the Symcloud file server, which runs on the control plane nodes. Storage for each file collection is allocated from Symcloud managed storage as a volume.
The platform can use any storage system that has a CSI compatible driver. This storage can be used for ephemeral and persistent storage exactly like Symcloud Storage. However, the backup, snapshot, and migration capabilities of the platform are only available for applications using Symcloud Storage.
The platform has two types of lakehouse storage available: HDFS storage on Dell PowerScale and S3 object storage on Dell ECS.
These storage systems are managed and scaled independently from the core platform, providing a decoupled storage and compute architecture.
Applications can connect to either or both lakehouse storage options directly over the network from the application-level code:

- The Hadoop HDFS client libraries (hadoop-hdfs-client) provide the hdfs:// protocol.
- The Hadoop AWS libraries (hadoop-aws) provide the s3a:// protocol.

Depending on the application and its implementation, the images and application bundle may require that the necessary libraries be included. The application must handle authentication to any external storage.
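For an application such as Spark, reaching the two storage options typically comes down to a handful of Hadoop configuration properties. A hedged sketch, where the endpoints and credentials are placeholders but the property names are the standard hadoop-hdfs-client and hadoop-aws (s3a) keys:

```python
# Hedged sketch: typical Hadoop configuration for the two lakehouse storage
# options. Endpoint hostnames and credentials are placeholders; the property
# names are the standard HDFS and s3a configuration keys.
lakehouse_conf = {
    # HDFS access (for example, PowerScale) via the hdfs:// protocol
    "fs.defaultFS": "hdfs://namenode.example.com:8020",
    # S3-compatible object access (for example, ECS) via the s3a:// protocol
    "fs.s3a.endpoint": "https://objects.example.com",
    "fs.s3a.access.key": "EXAMPLE_ACCESS_KEY",  # placeholder credential
    "fs.s3a.secret.key": "EXAMPLE_SECRET_KEY",  # placeholder credential
    "fs.s3a.path.style.access": "true",         # common for non-AWS S3 endpoints
}

def storage_path(scheme: str, root: str, key: str) -> str:
    """Build a lakehouse URI such as s3a://bucket/table or hdfs://dir/file."""
    return f"{scheme}://{root}/{key}"
```

Application code then addresses data uniformly by URI, for example `storage_path("s3a", "warehouse", "sales/part-0.parquet")`, leaving authentication to whatever mechanism the external storage requires.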
All system administration and management are handled through a single web interface. Operations can also be performed and automated through the command line.
The Symcloud Platform offers integrated multitenancy to enable a shared cloud experience, with physical and logical separation between tenants. It includes integrated support for role-based access control to manage end-user credentials across each tenant. It also includes support for chargeback features to enable multiple departments and use cases.
The platform also includes highly efficient monitoring and metrics collection of hardware components, pods, and applications. These features provide holistic observability of all platform activity in a single location.
The platform uses software defined networking (SDN) to handle communications for the cluster. SDN includes container-to-container connections, pod-to-pod connections, ingress to pod connections, and pod to external services like PowerScale and ECS. These network services are provided through the Kubernetes Container Network Interface (CNI) and supported by CNI compatible network plugins.
Each physical node in the cluster is assigned an IP address for its connection to the Cluster data network. Each connection uses a pair of physical network ports that are bonded with IEEE 802.3ad dynamic link aggregation. This configuration provides both load balancing across physical links, and fault tolerance if a link fails. All networking above this layer is software-defined and uses IP addresses that are private to the cluster.
When applications running on the cluster request IP addresses, the addresses are allocated from the tenant's IP pool. The IP pool specifies both the range of addresses available and the CNI driver to use. The platform supports three CNI driver options: Calico, Open vSwitch (OVS), and SR-IOV. Dell Technologies recommends using the OVS driver for most use cases, since it provides the best support for inbound access to applications from outside the cluster.
Inbound network access to applications is handled through a Kubernetes NodePort service. The NodePort configuration is specified in the application bundle and includes the external and internal port number mapping. Outbound network access from applications is handled through IP routing that is based on the destination.
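The external-to-internal port mapping declared in the application bundle corresponds to a standard Kubernetes NodePort Service. A minimal sketch, with port numbers and selector labels as example values:

```python
# Hedged sketch: a Kubernetes NodePort Service expressing the external and
# internal port mapping that an application bundle would declare. The service
# name, selector labels, and port numbers are example values.
nodeport_service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "analytics-ui"},
    "spec": {
        "type": "NodePort",
        "selector": {"app": "analytics-ui"},  # matches the application's pods
        "ports": [{
            "port": 8080,        # service port inside the cluster
            "targetPort": 8080,  # container port the traffic is forwarded to
            "nodePort": 30080,   # externally reachable port on every cluster node
        }],
    },
}
```

With this mapping, clients outside the cluster reach the application at any node's data-network address on the nodePort, and Kubernetes forwards the traffic to the pods that match the selector.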