The Mistral model is quantized to 4-bit precision for efficiency. This quantization shrinks the model's memory footprint and compute requirements, allowing it to run on more modest hardware with lower resource consumption.
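The memory saving from 4-bit quantization can be estimated with simple arithmetic: a 7B-parameter model stores each weight in 16 bits at fp16 but only 4 bits when quantized. The sketch below illustrates that back-of-the-envelope calculation (weights only; it ignores activations, the KV cache, and quantization overhead, and the 7.3B parameter count is an approximation for Mistral 7B).

```python
def model_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight-storage footprint in gigabytes."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9

# Mistral 7B weights at 16-bit vs. 4-bit precision.
fp16_gb = model_memory_gb(7.3e9, 16)  # ~14.6 GB
int4_gb = model_memory_gb(7.3e9, 4)   # ~3.65 GB
print(f"fp16: {fp16_gb:.2f} GB, 4-bit: {int4_gb:.2f} GB")
```

The roughly 4x reduction is what lets the model fit on a single workstation GPU rather than requiring multi-GPU servers.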
| Component | Specification | Details |
| --- | --- | --- |
| Servers | Desktop workstations and high-performance servers | Equipped with NVIDIA GPUs for AI and ML workloads |
| Storage | High-speed SSDs | Used for data storage and quick data retrieval |
| Networking | High-capacity, low-latency network | Ensures seamless data transfer and system connectivity |
| Operating System | Ubuntu 20.04 LTS (Linux) | Provides stability and performance for AI tasks |
| Large Language Model | Mistral 7B Instruct v0.2 from Hugging Face | Handles natural language understanding and processing |
| Vector Database | ChromaDB | Manages vectorized representations of the data |
| Chain | LangChain RetrievalQA chain with a Hugging Face pipeline | Processes queries and retrieves information |
| User Interface | Gradio interface (non-blocking mode) | Provides an interactive platform for user interaction |
| Quantization | 4-bit quantization | Improves efficiency by reducing model size |
| Supported File Formats | CSV, PPT, PDF | Flexible data handling capabilities |
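At the core of the retrieval components listed above (the vector database and the RetrievalQA chain) is similarity search over embedded documents. The following minimal sketch shows that mechanism in plain Python, using hypothetical toy vectors in place of real sentence embeddings and an in-memory list in place of ChromaDB; a production system would embed documents with a model and delegate storage and search to the vector database.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, store, top_k=2):
    """Return the texts of the top_k documents ranked by similarity to the query."""
    ranked = sorted(store, key=lambda d: cosine_similarity(query_vec, d["vec"]),
                    reverse=True)
    return [d["text"] for d in ranked[:top_k]]

# Hypothetical toy embeddings for HR documents (3-dimensional for readability).
store = [
    {"text": "candidate resume: python developer", "vec": [0.9, 0.1, 0.0]},
    {"text": "candidate resume: graphic designer", "vec": [0.1, 0.9, 0.0]},
    {"text": "job posting: backend engineer",      "vec": [0.8, 0.2, 0.1]},
]
print(retrieve([1.0, 0.0, 0.0], store, top_k=2))
```

The retrieved texts are then passed as context to the language model, which is the retrieval-augmented generation pattern the chain component implements.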