Saving eval loss for every evaluation/saved checkpoint with Trainer

I am running the trainer with

--do_eval  --report_to none --evaluation_strategy epoch --num_train_epochs 10 

so I expect the trainer to evaluate after every epoch and save those results in either eval_results.json or all_results.json. Unfortunately, those files only contain the results of the last evaluation.

It is useful that checkpoints are saved, but I cannot find the evaluation results for each checkpoint. Having them would make it much easier to select the best checkpoint.

They will be in the log_history field of the trainer state (trainer_state.json), which is also saved in the same folder.
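As a rough sketch of how one might read those per-evaluation results back, the snippet below loads trainer_state.json and keeps the log_history entries that contain an eval_loss key. The helper names eval_losses and best_evaluation are made up for illustration, and the path passed on the command line is whatever output or checkpoint folder you used; the exact keys in each entry can vary with your metrics and library version.

```python
import json
import sys
from pathlib import Path


def eval_losses(trainer_state_path):
    """Return (step, epoch, eval_loss) for each evaluation entry in log_history."""
    state = json.loads(Path(trainer_state_path).read_text())
    return [
        (entry.get("step"), entry.get("epoch"), entry["eval_loss"])
        for entry in state.get("log_history", [])
        if "eval_loss" in entry  # training-loss entries have "loss" instead
    ]


def best_evaluation(trainer_state_path):
    """Return the entry with the lowest eval_loss, or None if there is none."""
    rows = eval_losses(trainer_state_path)
    return min(rows, key=lambda row: row[2]) if rows else None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is an assumption): python this_script.py <folder>/trainer_state.json
    for step, epoch, loss in eval_losses(sys.argv[1]):
        print(f"step={step} epoch={epoch} eval_loss={loss}")
    print("best:", best_evaluation(sys.argv[1]))
```

The step of the lowest-loss entry can then be matched against the checkpoint folders that were saved, provided a checkpoint was written at that step.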

Excellent, thanks!
