I’m using hyperparameter_search for hyperparameter tuning in the following way:
trainer = Trainer(
    model_init=model_init,
    args=training_args,
    train_dataset=train_set,
    eval_dataset=dev_set,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
best_trial = trainer.hyperparameter_search(
    backend="ray",
    direction="maximize",
    n_trials=10,
)
Everything works well, and I can see the information for the best trial in best_trial. However, how can I save the actual best model from that trial? I tried the trainer’s save_model method, i.e. trainer.save_model("path/to/a/folder"), but I get the following error:
    trainer.save_model(path/to/a/folder)
  File "/home/ubuntu/anaconda3/envs/ccr/lib/python3.6/site-packages/transformers/trainer.py", line 1885, in save_model
    self._save(output_dir)
  File "/home/ubuntu/anaconda3/envs/ccr/lib/python3.6/site-packages/transformers/trainer.py", line 1930, in _save
    state_dict = self.model.state_dict()
AttributeError: 'NoneType' object has no attribute 'state_dict'
It looks like the trainer does not hold the actual best model found during hyperparameter tuning (per the traceback, self.model is None). My goal is simple: I want to evaluate the best model from hyperparameter tuning on my final test set, but I can’t find a way to save it. I know I could take the hyperparameters from the best trial and fine-tune the model again, but I’d rather not do that; I just want to get the model produced by the search itself. Is there any way to do that? Thanks.
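For reference, the retraining route mentioned above would look roughly like the sketch below. It assumes that best_trial.hyperparameters is a dict whose keys match attribute names on trainer.args (for example learning_rate); the helper name apply_best_hyperparameters and the demo_* stand-in objects are hypothetical and only exist so the snippet runs without transformers installed.

```python
from types import SimpleNamespace


def apply_best_hyperparameters(trainer, best_trial):
    """Copy the best trial's hyperparameters onto trainer.args before retraining.

    Assumption: best_trial.hyperparameters is a dict whose keys are
    attribute names on trainer.args (e.g. "learning_rate").
    """
    for name, value in best_trial.hyperparameters.items():
        setattr(trainer.args, name, value)
    return trainer


# Hypothetical stand-ins for the real trainer and best_trial objects.
demo_trainer = SimpleNamespace(
    args=SimpleNamespace(learning_rate=5e-5, num_train_epochs=3)
)
demo_best_trial = SimpleNamespace(
    hyperparameters={"learning_rate": 2e-5, "num_train_epochs": 4}
)
apply_best_hyperparameters(demo_trainer, demo_best_trial)

# With the real objects, the remaining steps would be:
# trainer.train()
# trainer.save_model("path/to/a/folder")
```

This retrains from scratch with the winning settings, so it does not recover the exact weights produced during the search; it is only the fallback, not an answer to the question of retrieving the tuned model directly.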