Hyperparameter optimization and load_best_model_at_end

When optimizing hyperparameters, in the scope of a particular trial, is there a way to return the “best” model as in the load_best_model_at_end option? In other words, rather than optimizing the final metric of the trial, I’d like to optimize the best value of the metric observed in the trial, even if that was not at the end.

To be very specific, let’s say my accuracy metric looks like this for a trial:

0.5, 0.8, 0.9, 0.85

I’d like to return the model with 0.9 accuracy to the optimizer, not 0.85. (This is particularly useful with early stopping.)

I’m currently using the wandb backend, and although I have load_best_model_at_end enabled, wandb always sees the metrics of the most recent model, not the best one.

Hello, @ejschwartz.
My go-to in this kind of situation is to add a “best_metric” key to the eval metrics dictionary.
At each evaluation, update it to the max of its previous value and the new value, so it keeps memory of the best metric obtained during the trial.

Let’s take an example: suppose I’m doing multiclass classification and, for each trial, I want to select the best model as the one that maximizes the micro-averaged F1 score among all values obtained during the trial.

I’ll modify my compute_metrics function to add a “best_metric” key, computed as the max of the running best (initialized at 0.0) and the current eval value of my metric of choice (here the micro-averaged F1 score).
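For instance, a minimal sketch of such a compute_metrics (assuming logits and labels come straight from the Trainer and scikit-learn is available; the variable names are illustrative):

```python
import numpy as np
from sklearn.metrics import f1_score

# Running best value seen during the current trial; lives outside
# compute_metrics so it persists across evaluations.
best_metric = 0.0

def compute_metrics(eval_pred):
    global best_metric
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    micro_f1 = f1_score(labels, preds, average="micro")
    # Keep memory of the best value obtained so far in this trial.
    best_metric = max(best_metric, micro_f1)
    return {"micro_f1": micro_f1, "best_metric": best_metric}
```

One caveat: since best_metric persists across trials, you’ll want to reset it to 0.0 at the start of each trial, e.g. inside your model_init function, which the Trainer calls once per trial.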

Then my compute_objective function should be as simple as:
return metrics["eval_best_metric"]
(the Trainer prefixes every key returned by compute_metrics with eval_, hence the key name).
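Wired together, the search call could look like this (a sketch; direction and n_trials are illustrative, and backend="wandb" matches your setup):

```python
def compute_objective(metrics):
    # "best_metric" from compute_metrics, with the Trainer's "eval_" prefix
    return metrics["eval_best_metric"]

best_run = trainer.hyperparameter_search(
    direction="maximize",   # higher best_metric is better
    backend="wandb",
    n_trials=10,            # illustrative
    compute_objective=compute_objective,
)
```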

Does that help you?

Thanks, this seems like it should work. I’ll give it a try.