CUDA out of memory error while predicting (evaluation)

sachmatkris · July 16, 2023, 8:34pm

Hello,

I have been experimenting with fine-tuning the ‘facebook/bart-large-mnli’ model. Since it is very large, it barely fits on the GPU; I had to reduce batch_size to 1 through TrainingArguments, but I succeeded at training. However, now that I simply want to produce outputs with that model, I keep running into CUDA out of memory errors. I tried several things, but nothing helped. The only thing I found is that predicting on a dataset of size 1 works, but the data still stays on the GPU even though I use torch.inference_mode()/torch.no_grad().

It feels like the problem is in how the model handles its outputs: they are kept on the GPU even though I explicitly try to avoid that. Any suggestions?

Thanks for the help; this issue has been bugging me for over a week and I can't find a solution.

MoritzLaurer · March 22, 2024, 9:14am

I had the same problem and found the solution here: CUDA out of memory when using Trainer with compute_metrics - #13 by morenolq
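
For readers who land here without following the link: the linked post is not reproduced in this thread, so the following is only a sketch of one common workaround, not necessarily what that post recommends. torch.no_grad()/torch.inference_mode() stop autograd from saving activations, but they do not move results off the GPU, so outputs collected across batches can still pile up there. The sketch runs inference in small batches and copies each batch of logits to the CPU immediately. The names `predict_in_batches`, `toy_model`, and `toy_data` are hypothetical, and a tiny linear layer stands in for the fine-tuned model so the example runs anywhere.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


@torch.inference_mode()
def predict_in_batches(model, dataset, batch_size=8, device=None):
    """Run inference batch by batch, keeping only CPU copies of the logits."""
    device = device or ("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device).eval()
    cpu_logits = []
    for (inputs,) in DataLoader(dataset, batch_size=batch_size):
        logits = model(inputs.to(device))
        # Move each batch off the GPU right away so results never accumulate there.
        cpu_logits.append(logits.cpu())
    return torch.cat(cpu_logits)


# Hypothetical stand-in for the fine-tuned classifier (3 classes, as in MNLI).
toy_model = nn.Linear(16, 3)
toy_data = TensorDataset(torch.randn(20, 16))
predictions = predict_in_batches(toy_model, toy_data, batch_size=4)
print(predictions.shape)  # torch.Size([20, 3])
```

With a real Hugging Face model, the loop body would pass the tokenized batch to the model and keep only `outputs.logits`. When predicting through Trainer instead, setting `eval_accumulation_steps` in TrainingArguments (which moves accumulated predictions to the CPU every N steps) or passing a `preprocess_logits_for_metrics` function to Trainer are the usual options for the same problem.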
