You can pass those parameters later on. IMO it’s easier to invoke get_model()
without any parameters, and then, having instantiated the Trainer with get_model
passed as the model_init argument, you can define the parameters you want to tune and pass them as the hp_space
param of the .hyperparameter_search() method.
Below is a snippet that I used to tune hyperparams, hope you’ll find it useful.
from transformers import (
    RobertaConfig,
    RobertaForQuestionAnswering,
    Trainer,
    TrainingArguments,
)
from ray import tune
from ray.tune.schedulers import PopulationBasedTraining

config = RobertaConfig.from_pretrained(config_path)

def get_model():
    return RobertaForQuestionAnswering.from_pretrained(
        model_path, config=config)
training_args = TrainingArguments(...)
trainer = Trainer(
    # pass the function itself (not get_model()), so a fresh model is built for every trial
    model_init=get_model,
    args=training_args,
    ...
)
# now this is where you can define your hyperparam space for Tune
tune_config = {
    # keys should match TrainingArguments field names
    "learning_rate": tune.uniform(1e-5, 5e-5),
    "weight_decay": tune.choice([0.1, 0.2, 0.3]),
}
# and/or, if you're using a scheduler
scheduler = PopulationBasedTraining(
    metric="acc",
    mode="max",
    hyperparam_mutations={
        "per_device_train_batch_size": tune.choice([16, 32, 64, 128]),
        ...
    },
)
# finally
trainer.hyperparameter_search(
    hp_space=lambda _: tune_config,
    backend="ray",
    scheduler=scheduler,
)
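
By the way, .hyperparameter_search() returns a BestRun object, so if you assign the call above to a variable you can pull the winning config back out and retrain with it afterwards. A rough sketch of what I mean, assuming you keep the model_init setup from above:

best_run = trainer.hyperparameter_search(
    hp_space=lambda _: tune_config,
    backend="ray",
    scheduler=scheduler,
)

# BestRun is a namedtuple with run_id, objective and hyperparameters
print(best_run.run_id, best_run.objective)

# copy the winning values back into the TrainingArguments and retrain
for param, value in best_run.hyperparameters.items():
    setattr(trainer.args, param, value)
trainer.train()  # model_init gives you a fresh model for the final run

Also, with backend="ray" any extra kwargs you pass to .hyperparameter_search() (like the scheduler above) get forwarded to tune.run(), if I remember correctly, so you can control the Ray side (e.g. resources per trial) from there as well.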