I’m trying to fine-tune the 2.7B GPT-Neo model on some data I gathered. I’m running on Google Colab Pro with a T-100 16 GB GPU. When I run:
from happytransformer import HappyGeneration, GENTrainArgs

model = HappyGeneration("GPT-NEO", "EleutherAI/gpt-neo-2.7B")
args = GENTrainArgs(num_train_epochs=1, learning_rate=1e-5)
model.train("file.txt", args=args)
I get this error:
RuntimeError: CUDA out of memory. Tried to allocate 80.00 MiB (GPU 0; 15.90 GiB total capacity; 14.94 GiB already allocated; 61.75 MiB free; 14.96 GiB reserved in total by PyTorch)
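For context, here is my own back-of-envelope arithmetic (assumed figures, not something the library reports) suggesting that full fp32 training of a 2.7B-parameter model may simply not fit in 16 GB:

```python
# Rough memory estimate for fine-tuning gpt-neo-2.7B (assumptions, not measured):
params = 2.7e9        # parameter count of the 2.7B model

# fp32 weights alone: 4 bytes per parameter
weights_gb = params * 4 / 1024**3

# Full fp32 training with Adam: 4 B weights + 4 B gradients
# + ~8 B optimizer states per parameter
train_gb = params * (4 + 4 + 8) / 1024**3

print(round(weights_gb, 1))  # ~10.1 GiB just for the weights
print(round(train_gb, 1))    # ~40.2 GiB before activations
```

So the weights alone take roughly two thirds of the card, which seems consistent with the "14.94 GiB already allocated" in the traceback.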
I’m also getting this warning, so it may be related:
Token indices sequence length is longer than the specified maximum sequence length for this model (2337 > 2048). Running this sequence through the model will result in indexing errors
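As I understand the warning, my text tokenizes to 2337 tokens, which is longer than the model’s 2048-token context window. A minimal sketch of what splitting such a sequence into model-sized chunks could look like (the token ids here are made up for illustration; this is not happytransformer’s internal logic):

```python
# Hypothetical illustration of the 2048-token context limit.
MAX_LEN = 2048
token_ids = list(range(2337))  # stand-in for the 2337-token sequence the warning mentions

# Split the over-long sequence into chunks the model can actually accept.
chunks = [token_ids[i:i + MAX_LEN] for i in range(0, len(token_ids), MAX_LEN)]

print(len(chunks))     # 2
print(len(chunks[0]))  # 2048
print(len(chunks[1]))  # 289
```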
Does anyone know why I’m getting this error, and whether there is a way to train this model within 16 GB?