Hello everyone,
I am currently training a T5 model on a seq2seq task, and I wonder whether Optuna can be used to search over model-config hyperparameters such as `num_layers` or `num_heads`.
According to sgugger in this discussion:
> The hyperparameters you can tune must be in the `TrainingArguments` you passed to your `Trainer`. If you have custom ones that are not in `TrainingArguments`, just subclass `TrainingArguments` and add them in your subclass. The `hp_space` function indicates the hyperparameter search space (see the code of the default for Optuna or Ray in `training_utils.py` and adapt it to your needs) and the `compute_objective` function should return the objective to minimize/maximize.
But that was almost three years ago, so I am wondering whether there is a method to achieve this now.
Because `num_layers` and `num_heads` are not among the `Trainer`/`TrainingArguments` arguments, I really don't know how to search over them.