This section describes the procedure to deploy Gradio to provide a user interface for a digital assistant.
The example is a simple UI for a RAG-based chatbot that uses Gradio for the interface, Caikit + TGIS for LLM inference, and Redis as the vector database.
Deployment requirements include:
A pre-built container image of the application is available at: quay.io/rh-aiservices-bu/gradio-caikit-rag-redis:latest
The deployment folder includes the files necessary to deploy the application:
The parameters you pass as environment variables in the deployment are:
INFERENCE_SERVER_URL - mandatory
REDIS_URL - mandatory
REDIS_INDEX - mandatory
MAX_NEW_TOKENS - optional, default: 512
MIN_NEW_TOKENS - optional, default: 100
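The mandatory and optional parameters listed above can be checked before deployment with a small pre-flight script. The following is a minimal sketch, not part of the repository: the function name missing_mandatory_vars is hypothetical, and it assumes the values are exported in the current shell before being copied into the deployment.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not part of the application repository.

# Print the name of each mandatory variable that is unset or empty.
# Returns 0 when all three are set, 1 otherwise.
missing_mandatory_vars() {
    rc=0
    for name in INFERENCE_SERVER_URL REDIS_URL REDIS_INDEX; do
        # Indirect lookup of the variable named in $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
            rc=1
        fi
    done
    return $rc
}

# Optional variables fall back to the documented defaults.
MAX_NEW_TOKENS="${MAX_NEW_TOKENS:-512}"
MIN_NEW_TOKENS="${MIN_NEW_TOKENS:-100}"
```

Calling missing_mandatory_vars prints nothing and returns 0 when the three mandatory variables are set; otherwise it lists the missing names so that they can be filled in before the deployment is scaled up.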
Create a new project in the OpenShift cluster using the following command:
oc new-project dell-digital-assistant
Clone the Dell digital assistant GitHub repository, go to the Gradio application directory, and apply the deployment manifests:
git clone https://github.com/DellBizApps/dell-digital-assistent.git
cd dell-digital-assistent/05-UI/Gradio/caikit-rag-redis/
oc apply -f deployment/
Note: You can also deploy the application from the OpenShift dashboard by using the Source-to-Image (S2I) feature of the OpenShift cluster. Follow the instructions available on the Red Hat Customer Portal.
The deployment replica count is initially set to 0 so that you can fill in the environment variables for your setup before any pod starts. Set the values accordingly, then scale the deployment from 0 to 1 replica.
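Filling in the environment variables and scaling up can also be done from the command line. The following is a sketch under stated assumptions: the deployment name gradio-caikit-rag-redis is a guess and should be confirmed with oc get deployments, the run and configure_and_scale helpers are hypothetical, and the script only prints the commands unless DRY_RUN is set to 0.

```shell
#!/bin/sh
# Sketch only: DEPLOYMENT is an assumed name; confirm it with `oc get deployments`.
DEPLOYMENT="${DEPLOYMENT:-gradio-caikit-rag-redis}"
NAMESPACE="${NAMESPACE:-dell-digital-assistant}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set the mandatory variables on the deployment, scale it from 0 to 1,
# and wait for the rollout. Aborts if a mandatory variable is missing.
configure_and_scale() {
    run oc set env "deployment/$DEPLOYMENT" -n "$NAMESPACE" \
        INFERENCE_SERVER_URL="${INFERENCE_SERVER_URL:?must be set}" \
        REDIS_URL="${REDIS_URL:?must be set}" \
        REDIS_INDEX="${REDIS_INDEX:?must be set}"
    run oc scale "deployment/$DEPLOYMENT" -n "$NAMESPACE" --replicas=1
    run oc rollout status "deployment/$DEPLOYMENT" -n "$NAMESPACE"
}
```

After exporting the three mandatory variables, call configure_and_scale once to review the printed commands, then run it again with DRY_RUN=0 to apply them to the cluster.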