Hopefully a quick question!
If I set, say, logging_steps=500 in the TrainingArguments, I get a log line with the ‘loss’, ‘learning_rate’ and ‘epoch’ every 500 steps. Is the logged loss averaged over those 500 mini-batches, or is it the value computed on the 500th batch alone?
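To make the two possibilities concrete, here is a minimal sketch in plain Python of what I mean. It does not use the Trainer at all; the function name summarize_window and the loss values are made up purely for illustration, and a window of 4 steps stands in for 500.

```python
def summarize_window(losses):
    """Return (mean loss over the window, loss on the last batch only)."""
    if not losses:
        raise ValueError("need at least one loss value")
    mean_loss = sum(losses) / len(losses)  # option A: average over the logging window
    last_loss = losses[-1]                 # option B: value from the final batch only
    return mean_loss, last_loss


# Hypothetical per-batch losses for one logging window (illustrative values only)
window = [2.0, 1.5, 1.0, 0.5]
mean_loss, last_loss = summarize_window(window)
print(f"averaged over window: {mean_loss}, last batch only: {last_loss}")
```

The two numbers can differ quite a bit when the loss is dropping quickly, which is why I would like to know which one the log line reports.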
Thank you!