I don’t understand the question. With load_best_model_at_end, the model loaded at the end of training is the one that had the best performance on your validation set. So when you save that model, you have the best model on this validation set.
If it’s crap on another set, it means your validation set was not representative of the performance you wanted, and there is nothing we can do in Trainer to fix that.
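
To make the point concrete, here is a minimal sketch in plain Python. It is not the Trainer's actual code, only an illustration of "keep the checkpoint with the best validation metric". The checkpoint list, the metric names, and all the numbers are made up for the example, and it assumes a lower loss is better.

```python
# Hypothetical per-checkpoint metrics (made-up numbers, for illustration only).
# "val_loss" is the loss on the validation set used during training,
# "other_loss" is the loss on some other dataset. Lower is better.
checkpoints = [
    {"step": 500, "val_loss": 0.62, "other_loss": 0.70},
    {"step": 1000, "val_loss": 0.48, "other_loss": 0.81},
    {"step": 1500, "val_loss": 0.51, "other_loss": 0.66},
]


def best_checkpoint(checkpoints, metric):
    """Return the checkpoint with the lowest value of `metric`."""
    return min(checkpoints, key=lambda c: c[metric])


# Selection only ever looks at the validation metric.
best_on_val = best_checkpoint(checkpoints, "val_loss")

# The same selection done on the other set can pick a different checkpoint.
best_on_other = best_checkpoint(checkpoints, "other_loss")

print(best_on_val["step"])    # 1000
print(best_on_other["step"])  # 1500
```

In this toy data the checkpoint that wins on the validation set is not the one that wins on the other set. The selection itself is working as intended. The only real fix is a validation set that looks like the data you actually care about.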