Clarification on push_to_hub, best model, and model card

Hi,

I have a question that I’ve seen a few other people ask, but I couldn’t find an answer to it.

The question is quite simple: when using trainer.push_to_hub() together with the load_best_model_at_end argument, is the trainer pushing the last or the best model?

My question arises because the automatically created model card reports the selected metric for the last epoch rather than the value obtained at the best epoch.

I am reasonably sure it is not an issue with loading the best model, since I’ve verified via trainer.evaluate() that, after training, the trainer is in fact using the best model obtained.

Thank you so very much for any help/clarification, and the great work you’re all doing!


I second this question!
I’ve checked my model after pushing it to the Hub, and the trainer does indeed push the best model.
However, the automatically-created model card is indeed misleading, reporting the wrong (last) score and number of training steps.
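
For anyone who wants to check the same thing on their own run, here is a minimal sketch of how the two numbers can be compared. It assumes that trainer.state.log_history is a list of dicts in which evaluation entries carry the metric key (for example eval_loss) together with epoch and step. The helper name compare_last_and_best and the numbers in example_history are made up for illustration, so replace them with your own trainer.state.log_history and the metric you set in metric_for_best_model.

```python
from typing import Any, Dict, List, Optional


def compare_last_and_best(
    log_history: List[Dict[str, Any]],
    metric: str = "eval_loss",
    greater_is_better: bool = False,
) -> Optional[Dict[str, Dict[str, Any]]]:
    """Return the last and the best evaluation entries for `metric`.

    Hypothetical helper, not part of transformers. `log_history` is assumed
    to have the same shape as `trainer.state.log_history`.
    """
    # Keep only the entries that were logged during evaluation.
    evals = [entry for entry in log_history if metric in entry]
    if not evals:
        return None
    # For a loss, lower is better; for accuracy/F1, pass greater_is_better=True.
    pick = max if greater_is_better else min
    best = pick(evals, key=lambda entry: entry[metric])
    return {"last": evals[-1], "best": best}


# Made-up values for illustration only; use trainer.state.log_history instead.
example_history = [
    {"loss": 0.52, "step": 100},
    {"eval_loss": 0.41, "epoch": 1.0, "step": 100},
    {"eval_loss": 0.35, "epoch": 2.0, "step": 200},
    {"eval_loss": 0.38, "epoch": 3.0, "step": 300},
]

result = compare_last_and_best(example_history)
print("last:", result["last"])
print("best:", result["best"])
```

If the "last" entry matches what the model card shows while the "best" entry matches what trainer.evaluate() returns after training, that would be consistent with what is described above: the best model is pushed, but the card reports the final evaluation.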

I think this issue should be addressed by the team!


It would be nice to have this fixed, now that several courses offered by HF rely on the metrics reported in the automatically created model cards.
