I was wondering if there’s a way to accumulate the loss over multiple steps before backpropagating and taking an optimizer step while using the Trainer API? It’s easy to run out of CUDA memory, and I would like to avoid that.
If you are talking about gradient accumulation, you can set it with gradient_accumulation_steps=xxx in your TrainingArguments.
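For example, a minimal sketch of how this could look (the values, output directory, and the `model` / `train_dataset` objects are placeholders you would replace with your own):

```python
from transformers import Trainer, TrainingArguments

# Effective batch size = per_device_train_batch_size * gradient_accumulation_steps.
# Here: 4 * 8 = 32, while only 4 samples are held on the GPU at a time.
training_args = TrainingArguments(
    output_dir="./results",           # placeholder checkpoint directory
    per_device_train_batch_size=4,    # small per-step batch to fit in GPU memory
    gradient_accumulation_steps=8,    # accumulate gradients over 8 forward/backward passes
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,                # your model, assumed to be defined elsewhere
    args=training_args,
    train_dataset=train_dataset # your tokenized dataset, assumed to be defined elsewhere
)

trainer.train()
```

The Trainer only calls the optimizer (and zeroes the gradients) every `gradient_accumulation_steps` batches, so you get the same effective batch size with a much smaller memory footprint per step.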