How to save the best trial's model using `trainer.hyperparameter_search`

I’m using hyperparameter_search for hyperparameter tuning in the following way:

trainer = Trainer(
    model_init=model_init,
    args=training_args,
    train_dataset=train_set,
    eval_dataset=dev_set,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)

best_trial = trainer.hyperparameter_search(
    backend="ray",
    direction="maximize",
    n_trials=10,
)

Everything works and I can see the information for the best trial in best_trial. However, how can I save the actual model from the best trial? I tried the trainer’s save_model, as in trainer.save_model(path/to/a/folder), but I get the following error:

trainer.save_model(path/to/a/folder)
  File "/home/ubuntu/anaconda3/envs/ccr/lib/python3.6/site-packages/transformers/trainer.py", line 1885, in save_model
    self._save(output_dir)
  File "/home/ubuntu/anaconda3/envs/ccr/lib/python3.6/site-packages/transformers/trainer.py", line 1930, in _save
    state_dict = self.model.state_dict()
AttributeError: 'NoneType' object has no attribute 'state_dict'

It looks like the trainer does not hold the best model found during hyperparameter tuning (?). My goal is simple: I want to evaluate the best model from hyperparameter tuning on my final test set, but I can’t find a way to save it. I know I could take the hyperparameters from the best trial and fine-tune the model again, but I’d rather not do that; I just want the model that the search already trained. Is there any way to do that? Thanks.

@sgugger Any thoughts on this?

There is no automatic process right now. If you set save_strategy="epoch" and save_total_limit=1, a checkpoint of the model is saved for each trial, and you should be able to access it at the end by looking at checkpoint-{trial_id}-xxx.
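As a rough sketch of the workaround in the reply above, the snippet below collects the two suggested settings in a dict and adds a small helper that looks for checkpoint folders belonging to one trial. The names CHECKPOINT_KWARGS and find_trial_checkpoints are hypothetical, and so is the matching rule (any folder named checkpoint-* whose path mentions the trial id). The exact folder layout and where the trials write their output can differ between backends and library versions, so treat the search root as an assumption to verify on your own machine.

```python
from pathlib import Path

# Settings suggested in the reply above. Merge them into your own
# TrainingArguments(...) call; every other argument stays as you had it.
CHECKPOINT_KWARGS = {"save_strategy": "epoch", "save_total_limit": 1}


def find_trial_checkpoints(search_root, trial_id):
    """List checkpoint directories under search_root whose path mentions trial_id.

    Hypothetical helper: it only assumes that checkpoint folders start with
    "checkpoint-" and that the trial id appears somewhere in their path.
    """
    root = Path(search_root)
    matches = []
    for path in root.rglob("checkpoint-*"):
        # Compare against the path relative to the root so that the root's
        # own name cannot produce a false match.
        if path.is_dir() and str(trial_id) in str(path.relative_to(root)):
            matches.append(path)
    return sorted(matches)
```

After the search finishes, something like find_trial_checkpoints(training_args.output_dir, best_trial.run_id) may locate the folder, assuming the trials wrote under that directory. A checkpoint folder saved by the Trainer can then be passed to from_pretrained of the same model class used in model_init, which gives a model to evaluate on the test set without fine-tuning again.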

We’ll put making this automatic on the roadmap so it becomes easier in a future version! I hope to have some time to do this next week or the week after.
