The following table lists the tensor and pipeline parallelism values, batch sizes, and sequence lengths we used for each model:
| Model | Tensor Parallelism | Pipeline Parallelism | Micro batch size | Global batch size | Sequence length |
|-------------|---|---|---|-----|------|
| Llama 2 7B  | 2 | 1 | 1 | 144 | 4096 |
| Llama 2 70B | 4 | 4 | 1 | 144 | 4096 |
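For context on how these settings combine: in Megatron-style training, the data-parallel size equals the total number of GPUs divided by the product of the tensor and pipeline parallelism degrees, and the global batch is assembled by accumulating micro batches across the data-parallel ranks. The following minimal Python sketch derives those quantities; the 16-GPU world size in the usage example and the function name are illustrative assumptions, not values taken from this configuration.

```python
def parallel_layout(world_size, tp, pp, micro_batch, global_batch):
    """Derive the data-parallel size and gradient-accumulation steps
    implied by a tensor/pipeline parallelism configuration."""
    # Each model replica occupies tp * pp GPUs; the remaining factor
    # of the world size is the data-parallel degree.
    assert world_size % (tp * pp) == 0, "world size must be divisible by TP * PP"
    dp = world_size // (tp * pp)
    # The global batch is filled by micro batches accumulated across DP ranks.
    assert global_batch % (dp * micro_batch) == 0, \
        "global batch must be divisible by DP size * micro batch size"
    grad_accum_steps = global_batch // (dp * micro_batch)
    return dp, grad_accum_steps

# Example with the Llama 2 7B settings above, assuming a hypothetical 16-GPU run:
# TP=2, PP=1 leaves 8 data-parallel replicas, so a global batch of 144
# requires 144 / (8 * 1) = 18 gradient-accumulation steps.
print(parallel_layout(16, tp=2, pp=1, micro_batch=1, global_batch=144))  # (8, 18)
```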