The following figure shows the logical architecture of the digital assistant deployed on Dell APEX Cloud Platform for Red Hat OpenShift.
Figure 5. Digital assistant logical architecture
In our validated design, we use the Llama 2 model for language processing, LangChain to integrate the components of the LLM-based application and to process PDF files and web pages, Redis as the vector store, Caikit to serve the Llama 2 model, Gradio for the user interface, and object storage to hold the language model and other datasets. Solution components are deployed as microservices in the Red Hat OpenShift cluster.
The following list details the roles and responsibilities of each microservice.
- Domain-specific document processing microservice: This microservice converts PDF files and web pages into vector embeddings using the LangChain library. Vector embeddings represent text as numeric vectors, which enable semantic search when answering user queries.
- Knowledge base microservice: This microservice provides a scalable and efficient vector store for the digital assistant using Redis, an in-memory data structure store. It enables the digital assistant to perform real-time vector search and retrieval.
- Llama model serving microservice: This microservice serves the Llama 2 model using Caikit, an inference server for large language models. The model is served through Text Generation Inference (TGI), which enables high-performance text generation for the most popular open-source LLMs.
- UI microservice: This microservice provides a user-friendly interface for the digital assistant using Gradio, an open-source library for creating UIs for machine learning models. It enables users to interact with the digital assistant in a natural and intuitive way, making it more accessible and engaging. The UI microservice also uses the LangChain library to chain together components such as the vector store and the Llama 2 model.
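Serving a Caikit-packaged model on OpenShift typically follows the KServe pattern, where an InferenceService references a Caikit/TGI serving runtime. The fragment below is an illustrative sketch only: the resource name, runtime name, and storage path are assumptions, not values from this design.

```yaml
# Illustrative KServe InferenceService for a Caikit-formatted Llama 2 model.
# All names and the storageUri are hypothetical placeholders.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llama-2-7b
spec:
  predictor:
    model:
      modelFormat:
        name: caikit
      runtime: caikit-tgis-runtime   # Caikit runtime backed by TGI
      storageUri: s3://models/llama-2-7b-caikit/   # model pulled from object storage
```

Note how the `storageUri` ties back to the object storage component of the design, which holds the language model artifacts.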
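The document-processing step described above can be sketched as follows. In the validated design, LangChain text splitters and an embedding model perform this work; the fixed-size overlapping chunker and the hash-based "embedding" below are stand-ins that only illustrate the text-to-vector shape of the pipeline.

```python
# Sketch of the document-processing microservice's core step:
# split raw text into overlapping chunks, then map each chunk to a
# fixed-length numeric vector. The toy hash-based embedding is an
# assumption for illustration, not the real embedding model.
import hashlib

def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping chunks (stand-in for a LangChain splitter)."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

def toy_embed(chunk, dim=8):
    """Hash-based stand-in for a real embedding model; returns a dim-length vector."""
    digest = hashlib.sha256(chunk.encode()).digest()
    return [b / 255.0 for b in digest[:dim]]

doc = "Dell APEX Cloud Platform for Red Hat OpenShift reference text. " * 5
vectors = [(chunk, toy_embed(chunk)) for chunk in chunk_text(doc)]
```

The overlap between chunks preserves context that would otherwise be cut at chunk boundaries, which improves retrieval quality for queries that span two chunks.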
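The request path through the UI microservice can be sketched as a simple chain: retrieve context from the vector store, build a prompt, and call the served model. In the validated design, LangChain wires these steps together and Gradio wraps the resulting function in a chat UI; the `retrieve` and `generate` stubs below are stand-ins for the Redis and Caikit/TGI services.

```python
# Sketch of the UI microservice's question-answering chain. The two
# stub functions are assumptions standing in for the real services:
# retrieve() for the Redis vector store, generate() for the
# Caikit/TGI-served Llama 2 model.
def retrieve(query):
    # Stand-in for a vector-store similarity search against Redis.
    return ["Context passage about " + query]

def generate(prompt):
    # Stand-in for a call to the served Llama 2 model.
    return "Answer based on: " + prompt.splitlines()[0]

def answer(query):
    """Chain retrieval and generation for one user query."""
    context = "\n".join(retrieve(query))
    prompt = context + "\nQuestion: " + query
    return generate(prompt)

print(answer("APEX sizing"))  # Answer based on: Context passage about APEX sizing
```

In the deployed assistant, Gradio exposes a function with this shape as a web chat interface, so end users never interact with the vector store or model server directly.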