When optimizing hyperparameters, in the scope of a particular trial, is there a way to return the “best” model as in the
load_best_model_at_end option? In other words, rather than optimizing the final metric of the trial, I’d like to optimize the best value of the metric observed in the trial, even if that was not at the end.
To be very specific, let’s say my accuracy metric looks like this for a trial:
0.5, 0.8, 0.9, 0.85
I’d like to return the model with 0.9 accuracy to the optimizer, not the one with 0.85. (This is particularly useful with early stopping.)
I’m currently using the wandb backend, and although I have
load_best_model_at_end enabled, wandb always sees the most recent model.
My go-to in this kind of situation is to add a “best_metric” key to the eval metrics dictionary.
At each evaluation, update this key to the max of its previous value and the new metric value, so that it keeps a record of the best metric obtained so far during the trial.
Let’s take an example:
Suppose I’m doing multiclass classification and, for each trial, I want to select the best model as the one that maximizes the micro-averaged F1 score among all values obtained during that trial.
I’ll modify my compute_metrics function to add a “best_metric” key, computed as the max of the running best value (initialized at 0.0) and the current eval value of my metric of choice (here, the micro-averaged F1 score).
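A minimal sketch of such a compute_metrics, assuming a single-label multiclass setup (where micro-averaged F1 reduces to plain accuracy, so NumPy alone suffices). The module-level `_best_metric` variable and its name are my own illustration, not part of the Trainer API — in a real hyperparameter search you’d reset it at the start of each trial, e.g. inside your `model_init`:

```python
import numpy as np

# Running best value observed during the current trial.
# Assumption: reset this to 0.0 between trials (e.g. in model_init).
_best_metric = 0.0

def compute_metrics(eval_pred):
    """Report the current micro-averaged F1 plus the best value seen so far."""
    global _best_metric
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    # For single-label multiclass, micro-averaged F1 equals accuracy.
    micro_f1 = float((preds == labels).mean())
    _best_metric = max(_best_metric, micro_f1)
    return {"micro_avg_f1": micro_f1, "best_metric": _best_metric}
```

Each eval then reports both the current value and the running maximum, so the "best_metric" entry can only go up within a trial.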
Then my compute_objective function should be as simple as:
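A sketch of that objective, assuming the Trainer’s usual behavior of prefixing compute_metrics keys with "eval_" before passing them to compute_objective:

```python
def compute_objective(metrics: dict) -> float:
    # Return the running best rather than the latest value.
    # The Trainer prepends "eval_" to the keys returned by compute_metrics.
    return metrics["eval_best_metric"]
```

You would then pass it as `trainer.hyperparameter_search(compute_objective=compute_objective, direction="maximize", ...)`.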
Does that help you?
Thanks, this seems like it should work. I’ll give it a try.