Control-plane nodes manage the cluster's control-plane infrastructure. Three control-plane nodes form a unified control plane for operating an OpenShift cluster. The control plane runs separately from the application container workloads and is responsible for the continued viability, health, availability, and integrity of the container ecosystem.
OpenShift Container Platform also deploys additional control-plane infrastructure to manage OpenShift-specific cluster components.
The control plane provides the following functions:
- API server: The API server exposes the Kubernetes control-plane API for other platform services (such as a web console) to consume and has API endpoints to manage cluster resources.
- Etcd: This highly available and consistent key-value store maintains Kubernetes cluster data. The etcd daemon runs on each control-plane node and requires a majority consensus to achieve quorum (the formula is $\lfloor n/2 \rfloor + 1$, where $n$ is the number of control-plane nodes). For production clusters, at least three control-plane nodes are required, each running an etcd daemon. With three nodes, quorum therefore requires at least two operating control-plane nodes.
- Scheduler: The Kubernetes scheduler assigns new pods to a node based on the resource requirements (for example, for CPU, RAM, and GPU) and the affinity and anti-affinity mechanisms.
- Controller manager: The controller manager runs all controller processes. While each controller process is independent, the processes run as a single executable to reduce complexity. The controllers include the node, replication, endpoints, service, and token controllers.
- OpenShift API server: The OpenShift API server validates and configures the data for OpenShift resources such as projects, routes, and templates. The OpenShift API server is managed by the OpenShift API server operator.
- OpenShift controller manager: The OpenShift controller manager watches etcd for changes to OpenShift objects such as project, route, and template controller objects, and then uses the API to enforce the specified state. The OpenShift controller manager is managed by the OpenShift controller manager operator.
- OpenShift OAuth API server: The OpenShift OAuth API server validates and configures the data for OpenShift Container Platform authentication, such as users, groups, and OAuth tokens. The OpenShift OAuth API server is managed by the cluster authentication operator.
- OpenShift OAuth server: Users request tokens from the OpenShift OAuth server to authenticate themselves to the API. The OpenShift OAuth server is managed by the cluster authentication operator.
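The etcd quorum arithmetic described in the list above can be illustrated with a minimal sketch. The function names `quorum_size` and `failures_tolerated` are hypothetical helpers introduced only for this example; they are not part of etcd or OpenShift.

```python
def quorum_size(members: int) -> int:
    """Smallest majority of an etcd cluster with `members` voting members."""
    if members < 1:
        raise ValueError("an etcd cluster needs at least one member")
    # Integer division implements floor(n / 2); adding 1 gives a strict majority.
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """Number of members that can fail while quorum is still achievable."""
    return members - quorum_size(members)


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(f"members={n} quorum={quorum_size(n)} "
              f"tolerated_failures={failures_tolerated(n)}")
```

For three control-plane nodes this gives a quorum of two and a tolerance of one failed node, which matches the production requirement stated above. An even member count does not improve tolerance: four members need a quorum of three and still tolerate only one failure.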
Backup and disaster recovery
While OpenShift Container Platform is resilient to node failure, regular backups of the etcd data store are recommended. Because taking an etcd backup is a blocking procedure, take backups at off-peak hours in production environments. When you update a cluster within a minor version (for example, from 4.10.2 to 4.10.3), take an etcd backup of the version of OpenShift Container Platform that is running on your cluster or clusters. Take the first etcd backup 24 hours after the cluster has been installed so that the initial certificate rotation can complete; otherwise, the etcd backup might contain expired certificates. For more information, see the Red Hat OpenShift document Backing up etcd.
Because of the etcd quorum requirements, if enough control-plane nodes fail that a majority of them are no longer operating, quorum is lost and restoring to a previous cluster state becomes the only option for cluster recovery. See Restoring to a previous cluster state. If a majority of control-plane nodes are still operating, meaning that quorum can be achieved but no redundancy exists for further node failure, replace the unhealthy etcd members. See Replacing an unhealthy etcd member.
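The choice between the two recovery paths above follows directly from the quorum rule. The sketch below is illustrative only: `recovery_action` and its return strings are hypothetical names for this example, and real recovery should follow the referenced Red Hat procedures.

```python
def recovery_action(total_members: int, healthy_members: int) -> str:
    """Pick a recovery path from the number of healthy etcd members.

    Illustrative decision helper only; it does not inspect a real cluster.
    """
    if total_members < 1 or not 0 <= healthy_members <= total_members:
        raise ValueError("healthy_members must be between 0 and total_members")

    quorum = total_members // 2 + 1  # majority needed for etcd consensus

    if healthy_members == total_members:
        return "healthy"
    if healthy_members >= quorum:
        # Quorum holds, but redundancy is reduced: replace the failed members.
        return "replace-unhealthy-member"
    # Quorum is lost: only a restore from an etcd backup can recover the cluster.
    return "restore-previous-state"


if __name__ == "__main__":
    for healthy in (3, 2, 1, 0):
        print(f"3 members, {healthy} healthy -> {recovery_action(3, healthy)}")
```

In a three-node control plane, losing one node leads to member replacement, while losing two or more leads to a restore from backup.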