GPU OOM when training

Thank you, Bram! You are totally right: the issue was caused by a problematic data sample of extremely large size. On top of that, my batch size didn't fit with the max_seq_length; when I used padding="max_length" I got an OOM on the very first batch, which was expected. I reduced my batch size (sad) and truncated the samples to tokenizer.model_max_length.
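For anyone hitting the same thing, here is a minimal sketch of the truncation fix, assuming a standard transformers tokenizer; the checkpoint name and sample texts are placeholders:

```python
from transformers import AutoTokenizer

# Placeholder checkpoint; substitute your own model.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

texts = ["a short sample", "a very long sample " * 5000]  # dummy data

# Truncate every sample to the model's maximum input length so a single
# oversized sample can't blow up GPU memory, and pad only to the longest
# sample in the batch instead of padding everything to max_length.
batch = tokenizer(
    texts,
    truncation=True,
    max_length=tokenizer.model_max_length,
    padding="longest",
    return_tensors="pt",
)
```

Padding to the longest sample per batch (rather than "max_length") keeps memory proportional to the actual data, which is what let me avoid the OOM on the first batch.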