Download the Llama 2 Model
The model is available on Hugging Face. To obtain access to the Llama 2 model, we completed the license agreement required by Meta AI.
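Once the license request is approved and the Hugging Face account is linked, the model weights can be fetched with the Hugging Face CLI. This is a minimal sketch, not the exact commands used in this study; the repository id `meta-llama/Llama-2-7b-chat-hf` and the local directory name are assumptions:

```shell
# Install the Hugging Face Hub CLI (assumed environment: Python with pip available)
pip install -U "huggingface_hub[cli]"

# Authenticate with the Hugging Face account that was granted Llama 2 access
huggingface-cli login

# Download the chat model weights to a local directory (repo id assumed)
huggingface-cli download meta-llama/Llama-2-7b-chat-hf --local-dir ./llama-2-7b-chat-hf
```

The download is gated, so the login step must use the same account on which the Meta AI license was accepted.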
The model's GPU memory consumption on our system is shown in the following table.
| Model | Model Precision | GPUs Used | GPU Memory Consumed | Platform |
|---|---|---|---|---|
| Llama 2-7B-chat | FP16 | 1 x A100-40GB | 14.08 GiB | PowerEdge R760xa |
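The measured footprint is consistent with holding the weights in FP16 at 2 bytes per parameter, plus runtime overhead. A rough back-of-the-envelope check, assuming Llama 2-7B has about 6.74 billion parameters (a commonly cited figure, not stated in this document):

```python
def fp16_weight_gib(n_params: float) -> float:
    """GiB needed to store n_params weights at 2 bytes each (FP16)."""
    return n_params * 2 / 2**30

# Assumed parameter count for Llama 2-7B: ~6.74e9
weights_gib = fp16_weight_gib(6.74e9)
print(f"{weights_gib:.2f} GiB")  # ~12.55 GiB for weights alone
```

The gap between the ~12.55 GiB of raw weights and the 14.08 GiB observed would be accounted for by the CUDA context, activation buffers, and framework overhead.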