Llama 2: Inferencing on a Single GPU

This document describes how to deploy the Meta Llama 2 7B-parameter model and run inference on it using a single NVIDIA A100 GPU with 40 GB of memory.
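
As a rough illustration of why a 7B-parameter model can fit on a single 40 GB GPU, the sketch below estimates the memory needed for the model weights alone at a few common precisions. The parameter count is the nominal 7 billion, the helper name `weight_memory_gb` is hypothetical, and the estimate deliberately ignores the KV cache, activations, and framework overhead, so actual usage will be higher.

```python
# Back-of-the-envelope estimate of weight memory for a 7B-parameter model.
# Assumptions (not from the source): nominal 7e9 parameters, decimal GB,
# weights only (no KV cache, activations, or runtime overhead).

NOMINAL_PARAMS = 7e9   # "7B" taken at face value
GPU_MEMORY_GB = 40     # A100 40 GB, as stated in the document


def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Return the memory needed to hold the weights, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for precision, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
        needed = weight_memory_gb(NOMINAL_PARAMS, nbytes)
        headroom = GPU_MEMORY_GB - needed
        print(f"{precision:>9}: {needed:5.1f} GB weights, {headroom:5.1f} GB left over")
```

At 16-bit precision the weights take roughly 14 GB under these assumptions, which leaves headroom on a 40 GB card for the runtime state that this sketch does not model.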
