The Mistral model is quantized to 4-bit precision for efficiency. This quantization shrinks the model's memory footprint and compute requirements, allowing it to run on more modest hardware with lower resource consumption.
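The memory saving from 4-bit quantization can be estimated with simple arithmetic: a 7B-parameter model stores each weight in 16 bits at fp16 but only 4 bits when quantized. The sketch below illustrates that back-of-the-envelope calculation (weights only; it ignores activations, the KV cache, and quantization overhead, and the 7.3B parameter count is an approximation for Mistral 7B).

```python
def model_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight-storage footprint in gigabytes."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9

# Mistral 7B weights at 16-bit vs. 4-bit precision.
fp16_gb = model_memory_gb(7.3e9, 16)  # ~14.6 GB
int4_gb = model_memory_gb(7.3e9, 4)   # ~3.65 GB
print(f"fp16: {fp16_gb:.2f} GB, 4-bit: {int4_gb:.2f} GB")
```

The roughly 4x reduction is what lets the model fit on a single workstation GPU rather than requiring multi-GPU servers.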
| Component | Specification | Details |
| --- | --- | --- |
| Servers | Desktop workstations and high-performance servers | Equipped with NVIDIA GPUs for AI and ML workloads |
| Storage | High-speed SSDs | Used for data storage and quick data retrieval |
| Networking | High-capacity, low-latency network | Ensures seamless data transfer and system connectivity |
| Operating System | Ubuntu 20.04 LTS (Linux) | Provides stability and performance for AI tasks |
| Large Language Model | Mistral 7B Instruct v0.2 from Hugging Face | Handles natural language understanding and processing |
| Vector Database | ChromaDB | Manages vectorized representations of the data |
| Chain | LangChain RetrievalQA chain with a Hugging Face pipeline | Processes queries and retrieves information |
| User Interface | Gradio interface (non-blocking mode) | Provides an interactive platform for user interaction |
| Quantization | 4-bit quantization | Improves efficiency by reducing model size |
| Supported File Formats | CSV, PPT, PDF | Flexible data handling capabilities |
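At the core of the retrieval components listed above (the vector database and the RetrievalQA chain) is similarity search over embedded documents. The following minimal sketch shows that mechanism in plain Python, using hypothetical toy vectors in place of real sentence embeddings and an in-memory list in place of ChromaDB; a production system would embed documents with a model and delegate storage and search to the vector database.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, store, top_k=2):
    """Return the texts of the top_k documents ranked by similarity to the query."""
    ranked = sorted(store, key=lambda d: cosine_similarity(query_vec, d["vec"]),
                    reverse=True)
    return [d["text"] for d in ranked[:top_k]]

# Hypothetical toy embeddings for HR documents (3-dimensional for readability).
store = [
    {"text": "candidate resume: python developer", "vec": [0.9, 0.1, 0.0]},
    {"text": "candidate resume: graphic designer", "vec": [0.1, 0.9, 0.0]},
    {"text": "job posting: backend engineer",      "vec": [0.8, 0.2, 0.1]},
]
print(retrieve([1.0, 0.0, 0.0], store, top_k=2))
```

The retrieved texts are then passed as context to the language model, which is the retrieval-augmented generation pattern the chain component implements.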