cnvrg.io provides an enterprise-ready MLOps platform for data scientists and ML engineers to develop and publish AI applications. It is deployed on Kubernetes and includes container images with prebuilt deployment templates and default configurations.
The following figure shows how cnvrg.io is deployed on Kubernetes:
Figure 1. Software architecture for MLOps using cnvrg.io
As a containerized application, cnvrg.io uses Kubernetes capabilities to manage the life cycle of the deployment, while providing improved availability and scalability. The cnvrg.io platform consists of control plane pods and worker pods. The control plane pods are all the pods in the cnvrg.io management plane, such as the application server, web server, PostgreSQL, Fluent Bit, and Sidekiq. Note that the cnvrg.io control plane is different from the Kubernetes control plane. Unless stated otherwise, references to the control plane in this document refer to the cnvrg.io control plane.
As discussed earlier, a cnvrg.io deployment consists of a control plane, whose components manage the deployment, and worker nodes, where AI workloads run. This modular deployment allows administrators to size, manage, and operate the control plane independently of the worker nodes. Administrators can update cnvrg.io to the latest version and make new features available to the worker nodes with minimal impact.
In this design guide, we focus on the following types of scaling:
When the workload for a particular component (such as the Sidekiq job scheduler) increases and drives up resource use, such as average CPU use or average memory use, the Kubernetes Horizontal Pod Autoscaler (HPA) instructs the workload resource to scale up, and new control plane pods of that component are automatically deployed to handle the increased workload. If the load decreases and the number of pods is above the configured minimum, the HPA instructs the workload resource to scale down.
The HPA is implemented as a Kubernetes API resource and a controller. The resource determines the behavior of the controller. The horizontal pod autoscaling controller, running in the Kubernetes control plane, periodically adjusts the desired scale of its target (for example, a deployment) to match observed metrics such as average CPU use, average memory use, or any custom metric that you specify.
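The scaling decision described above can be sketched in a few lines of Python. This is an illustrative approximation of the replica calculation documented for the Kubernetes HPA (the ratio of the observed metric to the target metric, rounded up and clamped to the configured bounds), not the controller's actual implementation. The function name, parameter names, and the 0.1 tolerance default are assumptions made for this example; the real controller also accounts for pod readiness, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the replica count an HPA-style controller would request.

    All names here are hypothetical; only the ratio-based rule follows the
    documented HPA behavior.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the scale unchanged.
        desired = current_replicas
    else:
        # Scale proportionally to how far the metric is from its target.
        desired = math.ceil(current_replicas * ratio)
    # Never go below the configured minimum or above the maximum.
    return max(min_replicas, min(max_replicas, desired))


# Example: two pods averaging 90% CPU against a 60% target.
print(desired_replicas(2, 90, 60, min_replicas=1, max_replicas=10))
```

With these assumed inputs, the sketch requests a scale-up because the observed average is 1.5 times the target. When the load later drops, the same rule yields a smaller count, but never one below the configured minimum.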
There are three options to make AI data available to workspaces in cnvrg.io:
cnvrg.io uses ingress control and load balancers to govern access to the deployment. cnvrg.io requires a DNS wildcard subdomain record that resolves every subdomain of the cnvrg.io cluster domain to the ingress IP address, for example, *.cnvrg.my-org.com -> 172.20.13.42. Istio allocates subdomains to the different components of cnvrg.io and to new workspaces and endpoints.
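To illustrate what the wildcard record provides, the following Python sketch checks whether a hostname falls under a wildcard record and returns the ingress IP address if it does. It is a simplified model for illustration only and does not query a DNS server. The record and IP address come from the example above; the hostnames `app.cnvrg.my-org.com` and `my-workspace.cnvrg.my-org.com` and both function names are hypothetical.

```python
from typing import Optional

# Wildcard record from the example above: every subdomain of the cluster
# domain resolves to the same ingress IP address.
WILDCARD_RECORDS = {"*.cnvrg.my-org.com": "172.20.13.42"}


def matches_wildcard(hostname: str, wildcard: str) -> bool:
    """Return True if hostname is a subdomain covered by a '*.domain' record."""
    if not wildcard.startswith("*."):
        raise ValueError("expected a record of the form '*.example.com'")
    suffix = wildcard[1:].lower()          # for example, ".cnvrg.my-org.com"
    host = hostname.rstrip(".").lower()    # ignore case and a trailing dot
    # The hostname needs at least one label in front of the suffix.
    return host.endswith(suffix) and len(host) > len(suffix)


def resolve(hostname: str) -> Optional[str]:
    """Look up a hostname in the simplified wildcard table."""
    for wildcard, ip_address in WILDCARD_RECORDS.items():
        if matches_wildcard(hostname, wildcard):
            return ip_address
    return None


# Hypothetical hostnames of the kind the ingress layer might allocate.
for name in ("app.cnvrg.my-org.com", "my-workspace.cnvrg.my-org.com"):
    print(name, "->", resolve(name))
```

Because every matching hostname resolves to the same address, new workspaces and endpoints can receive their own subdomains without any further DNS changes; routing to the correct component happens at the ingress layer.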