I’m trying to fine-tune the 2.7B GPT-Neo model on some data I gathered. I’m running on Google Colab Pro with a T-100 16 GB GPU. When I run:
from happytransformer import HappyGeneration, GENTrainArgs

model = HappyGeneration("GPT-NEO", "EleutherAI/gpt-neo-2.7B")
args = GENTrainArgs(num_train_epochs=1, learning_rate=1e-5)
model.train("file.txt", args=args)
I get this error:
RuntimeError: CUDA out of memory. Tried to allocate 80.00 MiB (GPU 0; 15.90 GiB total capacity; 14.94 GiB already allocated; 61.75 MiB free; 14.96 GiB reserved in total by PyTorch)
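For context, here is my own back-of-envelope arithmetic (assumed figures, not something the library reports) suggesting that full fp32 training of a 2.7B-parameter model may simply not fit in 16 GB:

```python
# Rough memory estimate for fine-tuning gpt-neo-2.7B (assumptions, not measured):
params = 2.7e9        # parameter count of the 2.7B model

# fp32 weights alone: 4 bytes per parameter
weights_gb = params * 4 / 1024**3

# Full fp32 training with Adam: 4 B weights + 4 B gradients
# + ~8 B optimizer states per parameter
train_gb = params * (4 + 4 + 8) / 1024**3

print(round(weights_gb, 1))  # ~10.1 GiB just for the weights
print(round(train_gb, 1))    # ~40.2 GiB before activations
```

So the weights alone take roughly two thirds of the card, which seems consistent with the "14.94 GiB already allocated" in the traceback.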
I’m also getting this warning, so it may be related:
Token indices sequence length is longer than the specified maximum sequence length for this model (2337 > 2048). Running this sequence through the model will result in indexing errors
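As I understand the warning, my text tokenizes to 2337 tokens, which is longer than the model’s 2048-token context window. A minimal sketch of what splitting such a sequence into model-sized chunks could look like (the token ids here are made up for illustration; this is not happytransformer’s internal logic):

```python
# Hypothetical illustration of the 2048-token context limit.
MAX_LEN = 2048
token_ids = list(range(2337))  # stand-in for the 2337-token sequence the warning mentions

# Split the over-long sequence into chunks the model can actually accept.
chunks = [token_ids[i:i + MAX_LEN] for i in range(0, len(token_ids), MAX_LEN)]

print(len(chunks))     # 2
print(len(chunks[0]))  # 2048
print(len(chunks[1]))  # 289
```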
Does anyone know why I’m getting this error, and whether there is a way to train this model within 16 GB?