CUDA OOM Error When Finetuning GPT-Neo 2.7B

I’m trying to finetune the 2.7B model on some data I gathered. I’m running on Google Colab Pro with a T-100 16 GB GPU. When I run:

from happytransformer import HappyGeneration, GENTrainArgs

# Load GPT-Neo 2.7B and finetune it for one epoch on a plain-text file
model = HappyGeneration("GPT-NEO", "EleutherAI/gpt-neo-2.7B")
args = GENTrainArgs(num_train_epochs=1, learning_rate=1e-5)
model.train("file.txt", args=args)

I get this error:

RuntimeError: CUDA out of memory. Tried to allocate 80.00 MiB (GPU 0; 15.90 GiB total capacity; 14.94 GiB already allocated; 61.75 MiB free; 14.96 GiB reserved in total by PyTorch)
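
For scale, the fp32 weights of a 2.7B-parameter model are already over 10 GB, so a full finetune on a 16 GB card is extremely tight even before activations and optimizer state. If happytransformer doesn’t expose memory-saving options, here is a minimal sketch of the same finetune through the transformers Trainer with fp16, gradient checkpointing, and batch size 1; the output directory, block size, and accumulation steps are illustrative choices, not values from the original post:

from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, TextDataset,
                          DataCollatorForLanguageModeling)

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-2.7B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-2.7B")
model.gradient_checkpointing_enable()  # trade compute for activation memory

# TextDataset splits the text into fixed-size blocks, which also avoids
# the 2337 > 2048 sequence-length warning below
dataset = TextDataset(tokenizer=tokenizer, file_path="file.txt", block_size=512)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir="out",                # illustrative output directory
    num_train_epochs=1,
    learning_rate=1e-5,
    per_device_train_batch_size=1,   # smallest possible batch
    gradient_accumulation_steps=8,   # simulate a larger effective batch
    fp16=True,                       # half-precision activations/gradients
)

Trainer(model=model, args=training_args, train_dataset=dataset,
        data_collator=collator).train()

Even with these settings, Adam’s fp32 optimizer states for 2.7B parameters are around 20 GB on their own, so a 16 GB card may still run out of memory without CPU offloading (e.g., DeepSpeed ZeRO) or a smaller model.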

I’m also getting this warning, which may be related:

Token indices sequence length is longer than the specified maximum sequence length for this model (2337 > 2048). Running this sequence through the model will result in indexing errors
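
One way to address that warning would be to pre-chunk the training file so no sequence exceeds 2048 tokens before handing it to happytransformer. A sketch, assuming the standard transformers tokenizer API; "file_chunked.txt" is just an illustrative name:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-2.7B")
MAX_LEN = 2048  # GPT-Neo's maximum context length

with open("file.txt", encoding="utf-8") as f:
    text = f.read()

# Tokenizing here may print the same length warning; that's harmless,
# since we split the ids before they ever reach the model
ids = tokenizer(text).input_ids
chunks = [ids[i:i + MAX_LEN] for i in range(0, len(ids), MAX_LEN)]

with open("file_chunked.txt", "w", encoding="utf-8") as f:
    for chunk in chunks:
        f.write(tokenizer.decode(chunk) + "\n")

Training on "file_chunked.txt" would then keep every sequence within the limit, though the OOM itself is a separate issue: the model weights, not the sequence length, dominate the memory use.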

Does anyone know why I’m getting this error?

For the PyTorch side of this, check the thread "Keep getting CUDA OOM error with Pytorch failing to allocate all free memory" on the PyTorch Forums.
I am seeing something similar with XLM, but in my case the PyTorch config overrides are not being recognized. I need to check whether Hugging Face is overriding them internally.
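
If the override in question is the CUDA caching-allocator config, note that it must be set in the environment before CUDA is first initialized, which is one common reason it appears to be ignored. A minimal sketch, assuming a recent PyTorch version that supports max_split_size_mb:

import os

# Must be set before torch initializes CUDA, or it is silently ignored
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # import only after the variable is set

# After a training attempt, this summary shows fragmentation and whether
# the allocator setting took effect
print(torch.cuda.memory_summary())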