I’m trying to replicate the basic OPT examples from the documentation and I keep getting a CUDA out-of-memory error. I tried using low_cpu_mem_usage=True, since that has been a solution on other models, but it doesn’t make a difference.
Code is basic:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("facebook/opt-30b", torch_dtype=torch.float16).cuda()
And an example error is:
I know it’s a large model, but PyTorch reserving 43 GiB seems high. None of the solutions I can find on outside forums seem applicable to this model type (running smaller batches, clearing memory mid-run, or using koila wrappers). Any help much appreciated!
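For reference, here’s my rough back-of-envelope math on the fp16 weight size (assuming the model is ~30B parameters, which I’m inferring from the name, so I may be off):

import torch  # not strictly needed for the arithmetic, just matching my setup

# Approximate memory for the model weights alone in fp16,
# ignoring activations, CUDA context, and fragmentation.
params = 30e9          # assumed parameter count for opt-30b
bytes_per_param = 2    # fp16 = 2 bytes per parameter
gib = params * bytes_per_param / 2**30
print(f"{gib:.1f} GiB")  # ~55.9 GiB just for the weights

So if that math is right, even the fp16 weights alone wouldn’t fit on a single 40 GiB card, which makes me wonder whether the 43 GiB reservation is actually expected here. Am I computing this correctly?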