LLaMA 7B GPU Memory Requirement

Hi @Forbu14,

in full precision (float32), every parameter of the model is stored in 32 bits, or 4 bytes. Hence 4 bytes/parameter * 7 billion parameters = 28 billion bytes = 28 GB of GPU memory required, for inference only. In half precision, each parameter is stored in 16 bits, or 2 bytes, so you would need 14 GB for inference. There are now also 8-bit and 4-bit quantization algorithms, so with 4 bits (or half a byte) per parameter you would need only 3.5 GB of memory for inference. However, there is usually some additional overhead as you generate tokens (most notably the KV cache); see this nice blog post: Calculating GPU memory for serving LLMs | Substratus.AI.
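To make the arithmetic concrete, here's a minimal sketch of the weights-only estimate (the 7B parameter count and per-precision byte sizes are the assumptions from above; KV cache and activation overhead are not included):

```python
# Rough inference memory estimate: model weights only, no KV cache or activations.
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "int8": 1.0, "int4": 0.5}

def inference_memory_gb(num_params: float, precision: str) -> float:
    """Memory in GB needed just to hold the weights at a given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

for precision in BYTES_PER_PARAM:
    print(f"{precision}: {inference_memory_gb(7e9, precision):.1f} GB")
# float32: 28.0 GB, float16: 14.0 GB, int8: 7.0 GB, int4: 3.5 GB
```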

For training, it depends on the optimizer you use and on whether you use full fine-tuning or a parameter-efficient method (PEFT) such as QLoRA.

In case you use regular AdamW, you need 8 bytes per parameter for the optimizer states alone, as it keeps two extra values per parameter: exponential moving averages of the gradients and of the squared gradients (the first and second moments), each in float32. Hence, for a 7B model that is 8 bytes/parameter * 7 billion parameters = 56 GB of GPU memory, on top of the weights and gradients themselves. If you use Adafactor, you need roughly 4 bytes per parameter, or 28 GB of GPU memory. With the optimizers of bitsandbytes (like 8-bit AdamW), you would need only 2 bytes per parameter, or 14 GB of GPU memory.
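As a back-of-the-envelope sketch (the per-parameter byte counts are the ones quoted above, not exact measurements):

```python
# Optimizer-state memory per parameter, on top of weights and gradients.
OPTIMIZER_BYTES_PER_PARAM = {
    "AdamW (fp32 states)": 8,         # two fp32 moments: 4 + 4 bytes
    "Adafactor": 4,                   # factored second moment, roughly 4 bytes
    "AdamW 8-bit (bitsandbytes)": 2,  # two quantized 8-bit moments
}

num_params = 7e9
for name, bytes_per_param in OPTIMIZER_BYTES_PER_PARAM.items():
    print(f"{name}: {num_params * bytes_per_param / 1e9:.0f} GB")
# AdamW: 56 GB, Adafactor: 28 GB, AdamW 8-bit: 14 GB
```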

In case you use parameter-efficient methods like QLoRA, memory requirements are greatly reduced: Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA. Basically, one quantizes the base model in 8 or 4 bits and then trains adapters on top in float16.
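A minimal QLoRA-style setup with transformers + peft + bitsandbytes might look like the sketch below (the model id, LoRA rank, and target modules are illustrative assumptions, not recommendations from the linked post):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Quantize the frozen base model to 4-bit (NF4) on load.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Hypothetical checkpoint; substitute the one you actually use.
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Train only small low-rank adapters in half precision on top.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # assumption: attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction is trainable
```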

I highly recommend this guide: Methods and tools for efficient training on a single GPU, which goes over all of this in much more detail.
