Using hyperparameter-search in Trainer

You can pass those parameters later on. IMO it’s easier to invoke get_model() without any parameters, and then, having instantiated a Trainer with get_model passed as the model_init argument, you can define the parameters you want to tune and pass them as the hp_space param of the .hyperparameter_search() method.

Below is a snippet that I used to tune hyperparams; hope you’ll find it useful.

from transformers import (
    RobertaConfig,
    RobertaForQuestionAnswering,
    Trainer,
    TrainingArguments,
)
from ray import tune
from ray.tune.schedulers import PopulationBasedTraining

config = RobertaConfig.from_pretrained(config_path)

def get_model():
    return RobertaForQuestionAnswering.from_pretrained(
        model_path, config=config)

training_args = TrainingArguments(...)

trainer = Trainer(
    model_init=get_model,  # pass the callable itself, not get_model()
    args=training_args,
    ...
)

# now this is where you can define your hyperparam space for Tune
tune_config = {
    "lr": tune.uniform(1e-5, 5e-5),
    "weight_decay": tune.choice([0.1, 0.2, 0.3])
}

# and/or, if using scheduler
scheduler = PopulationBasedTraining(
    metric="acc",
    mode="max",
    hyperparam_mutations={
        "per_device_train_batch_size": tune.choice([16, 32, 64, 128]),
        ...
    }
)
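
For the scheduler’s metric to exist at all, the Trainer also needs a compute_metrics that produces it; eval metric keys get prefixed with "eval_" before they are reported to Tune. Here’s a rough sketch for the QA model above (the assumption that the eval set carries start_positions / end_positions columns is mine, adapt it to your data):

import numpy as np

def compute_metrics(eval_pred):
    # for a QA head, predictions are (start_logits, end_logits) and
    # label_ids are (start_positions, end_positions) -- assuming those
    # columns are present in the eval dataset
    start_logits, end_logits = eval_pred.predictions
    start_positions, end_positions = eval_pred.label_ids
    start_acc = (np.argmax(start_logits, axis=-1) == start_positions).mean()
    end_acc = (np.argmax(end_logits, axis=-1) == end_positions).mean()
    return {"acc": (start_acc + end_acc) / 2}  # reported to Tune as "eval_acc"

Then pass it as compute_metrics=compute_metrics in the Trainer(...) call above.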

# finally, run the search and keep the result
best_run = trainer.hyperparameter_search(
    hp_space=lambda _: tune_config,
    backend="ray",
    scheduler=scheduler
)
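
hyperparameter_search returns a BestRun, so with best_run captured as above you can copy the winning values back onto the TrainingArguments and kick off one last full training run. Rough sketch:

# BestRun.hyperparameters is a dict keyed by the names used in hp_space,
# which here are TrainingArguments field names, so they can be written back
for param, value in best_run.hyperparameters.items():
    setattr(trainer.args, param, value)
trainer.train()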