I am using RoBERTa via Trainer.hyperparameter_search (transformers 3.5.x) to optimize a couple of model parameters. For some reason, even at very small learning rates (e.g., 5e-6), my validation loss increases epoch-over-epoch.
def hyperparameter_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-3, log=True),
        "other_weight": trial.suggest_float("other_weight", 0.1, 1.0, log=False),
    }

hp_search_output = my_trainer.hyperparameter_search(
    hp_space=hyperparameter_space,
    direction="minimize",
    study_name="2021-02-26_roberta_3_epochs",
    n_trials=100,
    compute_objective=lambda x: x["eval_loss"],
)
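For context, the Trainer itself is set up roughly as follows. This is a sketch, not my exact code: the model and dataset names are placeholders, and I'm assuming per-epoch evaluation via `evaluation_strategy="epoch"` (which is what produces the epoch-over-epoch eval_loss readings above). Note that `hyperparameter_search` takes a `model_init` callable rather than a fixed `model`, so each trial starts from fresh weights:

```python
from transformers import (RobertaForSequenceClassification, Trainer,
                          TrainingArguments)

def model_init():
    # hyperparameter_search re-instantiates the model for every trial,
    # so Trainer is given a model_init callable, not a model instance.
    return RobertaForSequenceClassification.from_pretrained("roberta-base")

training_args = TrainingArguments(
    output_dir="hp_search_output",
    evaluation_strategy="epoch",     # report eval_loss once per epoch
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

my_trainer = Trainer(
    model_init=model_init,
    args=training_args,
    train_dataset=train_dataset,     # placeholder: tokenized length-512 inputs
    eval_dataset=eval_dataset,       # placeholder
)
```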
Batch size is 16, with a total of ~80,000 input sequences of length 512.
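For scale, that works out to roughly the following step counts per trial (a back-of-envelope calculation, assuming no gradient accumulation and a single device, and 3 epochs per the study name above):

```python
# Back-of-envelope step count per trial. Assumes no gradient
# accumulation and a single device -- both are assumptions,
# not stated in the question.
num_examples = 80_000
batch_size = 16
num_epochs = 3  # per the study name above

steps_per_epoch = num_examples // batch_size
print(steps_per_epoch)               # 5000 optimizer steps per epoch
print(steps_per_epoch * num_epochs)  # 15000 steps per trial
```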
Any suggestions for fixing/investigating this?
FWIW, this is similar to an unanswered question here: