Kubespray is an open-source tool that automates Kubernetes cluster deployments using Ansible playbooks. This setup deploys all services on a single Dell PowerEdge XE9680 node. A multi-node, production-level deployment can add benefits such as high availability and room to scale.
Set up the deployment host:
sudo apt-get update
sudo apt-get install python3 python3-venv
git clone https://github.com/kubernetes-sigs/kubespray.git
VENVDIR=kubespray-venv
KUBESPRAYDIR=kubespray
python3 -m venv $VENVDIR
source $VENVDIR/bin/activate
cd $KUBESPRAYDIR
git checkout tags/v2.25.0 -b v2.25.0 # or latest
pip install -U -r requirements.txt
Copy and configure the inventory with target nodes:
cp -rfp inventory/sample inventory/mycluster
Generate the inventory hosts.yml file based on the number of hosts you have:
declare -a IPS=(192.168.1.1 your_ips_list) && CONFIG_FILE=inventory/mycluster/hosts.yml python3 contrib/inventory_builder/inventory.py "${IPS[@]}"
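Bash array elements are separated by spaces, so a comma inside the parentheses becomes part of the address it follows. As a minimal sketch, using hypothetical addresses that are not from this deployment, the array contents can be checked before they are passed to the inventory builder:

```shell
#!/usr/bin/env bash
# Hypothetical node addresses; replace them with your own.
declare -a IPS=(192.168.1.1 192.168.1.2 192.168.1.3)

# Print the number of elements, then one address per line.
echo "hosts: ${#IPS[@]}"
printf '%s\n' "${IPS[@]}"
```

If the count does not match the number of nodes, or an address ends in a comma, correct the array before generating hosts.yml.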
Validate the inventory files:
cat inventory/mycluster/hosts.yml
cat inventory/mycluster/group_vars/all/all.yml
cat inventory/mycluster/group_vars/k8s_cluster/addons.yml
Enable the helm_enabled and registry_enabled add-ons by setting both to true in the following file:
inventory/mycluster/group_vars/k8s_cluster/addons.yml
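The add-ons can be enabled in a text editor. As an alternative, the sketch below flips a key to true with sed. The enable_addon helper is hypothetical, and it assumes each key already appears at the start of a line in the form "key: value", which should be verified against your copy of addons.yml. It is demonstrated on a scratch file, not on the real inventory:

```shell
#!/usr/bin/env bash
# enable_addon FILE KEY: rewrite the line "KEY: <anything>" to "KEY: true".
# Assumption: KEY starts a line in FILE; a missing key is left untouched.
enable_addon() {
  local file="$1" key="$2"
  sed -i "s/^${key}:.*/${key}: true/" "$file"
}

# Demonstration on a scratch file with hypothetical content.
tmp="$(mktemp)"
printf 'helm_enabled: false\nregistry_enabled: false\n' > "$tmp"
enable_addon "$tmp" helm_enabled
enable_addon "$tmp" registry_enabled
cat "$tmp"
```

Against the real inventory, the call would be enable_addon inventory/mycluster/group_vars/k8s_cluster/addons.yml helm_enabled, and likewise for registry_enabled.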
Run the main Ansible playbook in $KUBESPRAYDIR:
ansible-playbook -i inventory/mycluster/hosts.yml cluster.yml -b
Monitor deployment progress in the console output.
Check node status and validate the core components:
sudo kubectl get nodes
sudo kubectl get pods -n kube-system
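On a larger cluster it can help to list only the nodes that are not yet ready. The sketch below filters the text that kubectl get nodes --no-headers prints. The not_ready_nodes helper and the sample output are hypothetical, and the filter assumes the default column order (NAME, STATUS, ...):

```shell
#!/usr/bin/env bash
# not_ready_nodes: read `kubectl get nodes --no-headers` output on stdin and
# print the name of every node whose STATUS column is not exactly "Ready".
# Cordoned nodes ("Ready,SchedulingDisabled") are therefore reported too.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Demonstration with hypothetical output.
sample='node1   Ready      control-plane   5m   v1.29.5
node2   NotReady   <none>          5m   v1.29.5'
printf '%s\n' "$sample" | not_ready_nodes
```

On the deployment host this would be run as sudo kubectl get nodes --no-headers | not_ready_nodes; empty output means every node reports Ready.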
Configure kubectl access for the current user so that kubectl no longer needs sudo:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Re-validate kubectl access:
kubectl get nodes
(Optional) Change the registry volume by editing the registry ReplicaSet:
kubectl -n kube-system edit rs registry
In the editor, change the registry volume to a hostPath volume as follows:
volumes:
- hostPath:
    path: /scratch-1/registry
    type: ""
  name: registry-pvc
To recreate the registry, delete the existing registry pod (replace registry-xkfft with the pod name shown in your cluster):
kubectl get pods --all-namespaces
kubectl delete -n kube-system pod registry-xkfft
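Because the pod name suffix differs on every cluster, the name can be extracted from the pod listing instead of being typed by hand. The sketch below is one way to do that. The registry_pods helper and the sample listing are hypothetical, and the filter assumes registry pods are named registry-<suffix> while proxy pods are named registry-proxy-<suffix>:

```shell
#!/usr/bin/env bash
# registry_pods: read `kubectl get pods -n kube-system --no-headers` output
# on stdin and print pod names that start with "registry-" but are not
# registry-proxy pods.
registry_pods() {
  awk '$1 ~ /^registry-/ && $1 !~ /^registry-proxy-/ { print $1 }'
}

# Demonstration with a hypothetical pod listing.
listing='coredns-abc12          1/1   Running   0   5m
registry-xkfft         1/1   Running   0   5m
registry-proxy-9z8kq   1/1   Running   0   5m'
printf '%s\n' "$listing" | registry_pods
```

On the cluster, the printed names can then be passed to kubectl delete -n kube-system pod.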
(Optional) Deploy the registry proxy DaemonSet, which exposes the registry on port 5000 of each node:
kubectl create -f - <<EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: registry-proxy
  namespace: kube-system
  labels:
    k8s-app: registry-proxy
    version: v0.4
spec:
  selector:
    matchLabels:
      k8s-app: registry-proxy
      version: v0.4
  template:
    metadata:
      labels:
        k8s-app: registry-proxy
        kubernetes.io/name: "registry-proxy"
        version: v0.4
    spec:
      containers:
      - env:
        - name: REGISTRY_HOST
          value: registry.kube-system.svc.cluster.local
        - name: REGISTRY_PORT
          value: "5000"
        image: gcr.io/google_containers/kube-registry-proxy:0.4
        imagePullPolicy: IfNotPresent
        name: registry-proxy
        ports:
        - containerPort: 80
          hostPort: 5000
          name: registry
          protocol: TCP
EOF
Validate that all pods are running, including the registry:
kubectl get pods --all-namespaces
(Optional) Set up monitoring with Prometheus and Grafana:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts && helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack \
  --version 54.2.2 \
  --namespace monitoring \
  --create-namespace \
  --set grafana.adminPassword=changeme \
  --set prometheus.service.nodePort=30901 \
  --set prometheus.service.type=NodePort
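The nodePort value must fall inside the cluster's NodePort range, which is 30000-32767 by default in Kubernetes. The sketch below checks a port before it is passed to helm. The valid_nodeport helper is hypothetical, and it assumes the default range has not been changed on the API server:

```shell
#!/usr/bin/env bash
# valid_nodeport PORT: succeed if PORT is an integer within the default
# Kubernetes NodePort range (30000-32767). Assumes the default range.
valid_nodeport() {
  local port="$1"
  [[ "$port" =~ ^[0-9]+$ ]] && (( port >= 30000 && port <= 32767 ))
}

# Check the port used in the helm command above.
if valid_nodeport 30901; then
  echo "30901 is within the default NodePort range"
fi
```

A port outside the range, such as 8080, makes the function return a non-zero status, so the value can be corrected before the chart is installed.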