LLaMA 7B GPU Memory Requirement

To run the 7B model in full precision, you need 7 * 4 = 28GB of GPU RAM. You should add torch_dtype=torch.float16 to use half the memory and fit the model on a T4.

15 Likes