- Not to my knowledge. But you can retrieve a log of all the Optuna trials (Hugging Face calls them “runs”) in two different ways.
First way
If you check the `Trainer` object after you have called its `hyperparameter_search()` method, you will find the log in `Trainer.state.log_history`; it is a list.
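For instance, a minimal sketch (assuming `trainer` is a `Trainer` already configured with a `model_init`, and `hp_space` is your Optuna search-space function; the `n_trials` value is just a placeholder):

```python
# Assumption: `trainer` and `hp_space` are already set up as in your own code.
best_run = trainer.hyperparameter_search(hp_space=hp_space,
                                         n_trials=10,
                                         direction='maximize')

# After the search, log_history is a list of dicts, one per logging event
# accumulated across the trials.
for entry in trainer.state.log_history:
    print(entry)
```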
Second way
The logs are also saved on disk as JSON files called `trainer_state.json` every time a checkpoint is saved during the optimization process. You can set the directory where checkpoints are stored with the `output_dir` parameter of `TrainingArguments()`; I recommend you also set its `logging_strategy` and `save_strategy` to `'epoch'`, so that you get one checkpoint, and therefore one saved log, at the end of every epoch.
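A minimal sketch of the relevant `TrainingArguments` settings (the directory name is only a placeholder):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='hp_search_output',  # checkpoints, each with a trainer_state.json, land here
    logging_strategy='epoch',       # log metrics at the end of every epoch
    save_strategy='epoch',          # save one checkpoint, and therefore one log, per epoch
)
```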
BTW, if you find a way to fetch the whole `Study` object, please do let me know.
In addition to the above, you can get very detailed information about what Optuna is doing by enabling its persistent storage on a SQLite database. It is straightforward. To browse the database I then use DBeaver CE. To enable the database storage, you can pass the parameters `storage` and `load_if_exists` to `Trainer.hyperparameter_search()`, e.g.:
```python
res = trainer.hyperparameter_search(hp_space=hp_space,
                                    n_trials=params.fine_tuning.n_trials,
                                    direction='maximize',
                                    compute_objective=compute_objective,
                                    sampler=optuna_sampler,
                                    study_name=study_name,
                                    storage='sqlite:///my_optuna_studies.db',
                                    load_if_exists=True,
                                    pruner=NopPruner())
```
Those parameters are then forwarded by the `Trainer` to `optuna.create_study()`; see the Optuna documentation for their usage.
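Related to fetching the `Study` object: while I don’t know how to get it back from the `Trainer` itself, once the SQLite storage is enabled you can reload the study directly with Optuna. A sketch, assuming the same `study_name` and database file as in the snippet above:

```python
import optuna

# Reload the study that the Trainer created through optuna.create_study().
study = optuna.load_study(study_name=study_name,
                          storage='sqlite:///my_optuna_studies.db')

print(study.best_trial.number, study.best_trial.params)
# One row per trial, including its state (COMPLETE, PRUNED, FAIL, ...).
print(study.trials_dataframe())
```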
- Yes, by default, unless you disable it. If you enable persistence in the database with Optuna (as per the point above), you can find information in the database about which trials have been pruned, in the `trials` table, `state` column.
To disable pruning, you can pass the parameter `pruner=NopPruner()` to `Trainer.hyperparameter_search()`, as I did in the code snippet at point 1. See the Optuna documentation to choose a pruning strategy instead.
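If you prefer to query the pruned trials from Python instead of DBeaver, a sketch using the same study name and storage as above:

```python
import optuna
from optuna.trial import TrialState

study = optuna.load_study(study_name=study_name,
                          storage='sqlite:///my_optuna_studies.db')

# Trials that the pruner stopped early end up in the PRUNED state.
pruned = study.get_trials(deepcopy=False, states=[TrialState.PRUNED])
print(f'{len(pruned)} pruned trials: {[t.number for t in pruned]}')
```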
There are a few caveats I have stumbled upon; I am still investigating, but they may interest you.
a) After a hyperparameter search, `Trainer.state.best_model_checkpoint` doesn’t contain the path to the best checkpoint saved, but to the last checkpoint saved instead. If I am getting this right, it is a Hugging Face bug.
b) Optuna’s DB persistence should allow you to interrupt and then resume the hyperparameter search. While I have had that feature work correctly when using just Optuna in a toy example, so far I couldn’t make it work with Hugging Face: when I try to resume the hyperparameter search, it restarts from the beginning instead, ignoring the trials from the previous experiment. Perhaps I am doing something wrong here.