I’m reading the source code of the Trainer class and have a question about the training_step function. In that function, the backward pass seems to happen before the loss is returned: after computing the loss, the code calls scaled_loss.backward() (or self.accelerator.backward(loss)) and then returns loss.detach() / self.args.gradient_accumulation_steps.
Does this mean the function first backpropagates the computed loss and only then divides the detached loss by self.args.gradient_accumulation_steps before returning it? How does this fit into the gradient accumulation strategy?
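For context, here is a minimal sketch of the gradient-accumulation pattern as I currently understand it. The toy model, optimizer, and data below are my own placeholders for illustration, not the actual Trainer code:

```python
import torch
import torch.nn as nn

# Toy setup so the snippet runs on its own; the real Trainer wraps this logic
# inside training_step and its outer training loop.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()
data = [(torch.randn(8, 10), torch.randn(8, 1)) for _ in range(8)]

accumulation_steps = 4  # analogous to args.gradient_accumulation_steps

optimizer.zero_grad()
for step, (inputs, targets) in enumerate(data):
    loss = loss_fn(model(inputs), targets)

    # Scale the loss so the accumulated gradients match the magnitude of a
    # single update over the full effective batch, then backpropagate.
    (loss / accumulation_steps).backward()

    # The optimizer only steps once every `accumulation_steps` micro-batches;
    # in between, gradients simply add up in the parameters' .grad buffers.
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```

Is this roughly what training_step is doing per micro-batch, with the optimizer step handled elsewhere in the training loop?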
The function is quite short, so I hope you can take a quick look.
Thanks for your help!
For reference: link to the source code of the training_step function.