Clarification on push_to_hub, best model, and model card


I have a question that I’ve seen a few other people ask, but I couldn’t find an answer.

The question is quite simple: when using trainer.push_to_hub() together with the load_best_model_at_end argument, is the trainer pushing the last or the best model?

My doubt comes from the fact that the automatically created model card reports the selected metric’s score for the last epoch, rather than the score obtained at the best epoch.

I am reasonably sure this is not an issue with loading the best model, since I’ve verified via trainer.evaluate() that, after training, the trainer is in fact using the best model obtained.

Thank you so very much for any help/clarification, and the great work you’re all doing!


I support this question!
I’ve checked my model after pushing it to the Hub, and the trainer does indeed push the best model.
However, the automatically-created model card is indeed misleading, reporting the wrong (last) score and number of training steps.
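
To see the mismatch for yourself, here is a minimal sketch that compares the last and the best evaluation entries in a training log. It assumes the log has the same shape as trainer.state.log_history (a list of dicts, where evaluation entries carry keys such as "eval_loss" and "step"). The helper name compare_last_and_best is hypothetical and not part of transformers.

```python
from typing import Any, Dict, List, Optional


def compare_last_and_best(
    log_history: List[Dict[str, Any]],
    metric: str = "eval_loss",
    greater_is_better: bool = False,
) -> Optional[Dict[str, Dict[str, Any]]]:
    """Return the last and the best evaluation entries for `metric`.

    Hypothetical helper: `log_history` is assumed to look like
    `trainer.state.log_history`, i.e. a list of dicts in logging order.
    """
    # Keep only the entries that actually report the metric
    # (training-loss entries usually do not contain "eval_*" keys).
    evals = [entry for entry in log_history if metric in entry]
    if not evals:
        return None

    last = evals[-1]
    pick = max if greater_is_better else min
    best = pick(evals, key=lambda entry: entry[metric])

    return {
        "last": {"step": last.get("step"), "value": last[metric]},
        "best": {"step": best.get("step"), "value": best[metric]},
    }


# Toy log standing in for trainer.state.log_history (made-up numbers).
example_log = [
    {"loss": 0.90, "step": 50},
    {"eval_loss": 0.50, "step": 100},
    {"eval_loss": 0.40, "step": 200},
    {"eval_loss": 0.45, "step": 300},
]
result = compare_last_and_best(example_log, metric="eval_loss")
print(result)
```

With a real run you would pass trainer.state.log_history instead of the toy log. If the "best" and "last" steps differ, the model card is likely showing the "last" values, while trainer.state.best_metric and trainer.state.best_model_checkpoint should point at the "best" ones.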

I think this issue should be addressed by the team!


It would be nice to have this fixed, now that several courses offered by HF rely on the metrics reported in the automatically created model cards.
