Using hyperparameter-search in Trainer

Could you please tell me where that README is? I checked your recent commits on both the trainer_optuna branch and master, and didn’t see it.

Sorry, not README, I meant the PR first post.


I’ve put a real example there now.

What are the pros/cons of optuna vs. Ray?

Both work with the API. I haven’t used either long enough to have a strong opinion, but from what I understood, Ray would be better if you have multiple GPUs and optuna might be better with just one.


FYI, this has been merged in master. Here is an example of use:

from nlp import load_dataset, load_metric
from transformers import AutoModelForSequenceClassification, AutoTokenizer, DataCollatorWithPadding, Trainer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
dataset = load_dataset('glue', 'mrpc')
metric = load_metric('glue', 'mrpc')

def encode(examples):
    outputs = tokenizer(examples['sentence1'], examples['sentence2'], truncation=True)
    return outputs

encoded_dataset = dataset.map(encode, batched=True)
# Won't be necessary when this PR is merged with master since the Trainer will do it automatically
encoded_dataset.set_format(columns=['attention_mask', 'input_ids', 'token_type_ids', 'label'])

def model_init():
    return AutoModelForSequenceClassification.from_pretrained('bert-base-cased', return_dict=True)

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = predictions.argmax(axis=-1)
    return metric.compute(predictions=predictions, references=labels)

# Evaluate during training and a bit more often than the default to be able to prune bad trials early.
# Disabling tqdm is a matter of preference.
training_args = TrainingArguments("test", evaluate_during_training=True, eval_steps=500, disable_tqdm=True)
trainer = Trainer(
    args=training_args,
    data_collator=DataCollatorWithPadding(tokenizer),
    train_dataset=encoded_dataset["train"], 
    eval_dataset=encoded_dataset["validation"], 
    model_init=model_init,
    compute_metrics=compute_metrics,
)

# Default objective is the sum of all metrics when metrics are provided, so we have to maximize it.
trainer.hyperparameter_search(direction="maximize")

This will use optuna or Ray Tune, depending on which you have installed. If you have both, it will use optuna by default, but you can pass backend="ray" to use Ray Tune. Note that you need to install nlp from source to make the example work.
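For instance, if both backends are installed and you want Ray Tune, the call is a minimal variation of the one above:

# Explicitly pick the Ray Tune backend instead of the optuna default
trainer.hyperparameter_search(direction="maximize", backend="ray")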

To customize the hyperparameter search space, you can pass a function hp_space to this call. Here is an example if you want to search higher learning rates than the default with optuna:

def my_hp_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-4, 1e-2, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 5),
        "seed": trial.suggest_int("seed", 1, 40),
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [4, 8, 16, 32, 64]),
    }

trainer.hyperparameter_search(direction="maximize", hp_space=my_hp_space)

and ray:

def my_hp_space_ray(trial):
    from ray import tune

    return {
        "learning_rate": tune.loguniform(1e-4, 1e-2),
        "num_train_epochs": tune.choice(range(1, 6)),
        "seed": tune.choice(range(1, 41)),
        "per_device_train_batch_size": tune.choice([4, 8, 16, 32, 64]),
    }

trainer.hyperparameter_search(direction="maximize", hp_space=my_hp_space)

If you want to customize the objective to minimize/maximize, pass along a function to compute_objective:

def my_objective(metrics):
    # Your elaborate computation here
    return result_to_optimize

trainer.hyperparameter_search(direction="maximize", compute_objective=my_objective)

Thanks. I was following this PR. I wanted to know which types of hyperparams can be tuned with this approach. Does it work with the default ones only (training_args)? What if we have a custom param that we want to tune (for instance, a lambda in an objective function)?

The hyperparams you can tune must be in the TrainingArguments you passed to your Trainer. If you have custom ones that are not in TrainingArguments, just subclass TrainingArguments and add them in your subclass.

The hp_space function indicates the hyperparameter search space (see the code of the default for optuna or Ray in training_utils.py and adapt it to your needs) and the compute_objective function should return the objective to minimize/maximize.
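For example, here is a minimal sketch of that subclassing approach (the my_lambda field and the ranges below are purely illustrative, not part of the library):

from dataclasses import dataclass, field
from transformers import TrainingArguments

@dataclass
class MyTrainingArguments(TrainingArguments):
    # Hypothetical extra hyperparameter, e.g. the weight of an auxiliary loss term
    my_lambda: float = field(default=0.1, metadata={"help": "Weight of the auxiliary loss."})

def my_hp_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True),
        # Works because my_lambda is now an attribute of the TrainingArguments subclass,
        # so it can be set like any other training argument at the start of each trial
        "my_lambda": trial.suggest_float("my_lambda", 1e-3, 1.0, log=True),
    }

Your training code (for instance an overridden loss computation in a Trainer subclass) can then read the value from the trainer’s args.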


Thank you so much! But I have a problem when defining the Trainer. It said, “init() got an unexpected keyword argument ‘model_init’”. Does the Trainer not recognize the ‘model_init’ argument?

I think this error leads to the next one when I call the ‘hyperparameter_search’ method. It said, “‘Trainer’ object has no attribute ‘hyperparameter_search’”.

What should I do? Very sorry for the very newbie question :pray: and thank you in advance.

This is new so you need an installation from source to use it. It will be in the next release coming soon otherwise.
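If you don’t want to wait, installing from source in a Colab cell looks something like this (just an example command, adapt as needed):

!pip install git+https://github.com/huggingface/transformers.git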


Alright, I’m waiting for it! :rocket:

FYI, you can now just pip install to use this feature. No need to build from source.


Oh yeah, thank you, it seems to be available now. But I’m still having a problem with the hyperparameter_search method. I set my backend parameter to ‘optuna’ but the error said: “You picked the optuna backend, but it is not installed. Use pip install optuna.”, even though I had already pip-installed it before the hyperparameter_search line. The same happened when I set the backend parameter to ‘ray’. Have I made a mistake? I run my code in Google Colab, by the way.

It means that it is not installed in your current environment. If you are using notebooks, you have to restart the kernel. Python needs to reload the libraries to see which ones are available.
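In a Colab notebook that typically means something like the following, followed by Runtime > Restart runtime before re-running your imports:

!pip install optuna
!pip install "ray[tune]"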


Oh yeah, it works now. Thank you so much! :grin:

I wonder if Sylvain or others might have advice on how to make the hyperparameter search more efficient or manageable, time- and resource-wise.

I’ve tried slimming down the dataset (500K rows to 90K rows), reducing the number of parameters to tune (to just 1, number of epochs) and changing the “direction” to “minimize” instead of “maximize”.

Is there something else I can do, aside from further cutting down the size of the dataset? I’m running trials on Colab Pro with GPU/high-RAM enabled, and current version looks like it’ll take about 7 hours (perfectly fine for others I’m sure).

I don’t suppose there’s an equivalent of RandomizedSearchCV for trainer?

Note that you can use pretty much anything in optuna and Ray Tune by just subclassing the Trainer and overriding the proper methods.
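For example, with the optuna backend you can cut the search cost by capping the number of trials and using a pruner; extra keyword arguments to hyperparameter_search are forwarded to the backend, so something along these lines should work (a sketch, please double-check against your installed version):

import optuna

best_run = trainer.hyperparameter_search(
    direction="minimize",
    n_trials=5,                            # fewer trials means less total compute
    pruner=optuna.pruners.MedianPruner(),  # stop unpromising trials early
)

As for an equivalent of RandomizedSearchCV: optuna’s default sampler already does a guided search, and you could pass sampler=optuna.samplers.RandomSampler() the same way if you really want plain random search.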


I’m having some issues with this, under the optuna backend. Here is my hyperparameter space :

def hyperparameter_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [8, 16, 32]),
        "weight_decay": trial.suggest_float("weight_decay", 1e-12, 1e-1, log=True),
        "adam_epsilon": trial.suggest_float("adam_epsilon", 1e-10, 1e-6, log=True)
    }

When I call trainer.hyperparameter_search on this, I find that it varies the number of epochs too, despite the number of epochs being fixed to 5 in TrainingArguments. The run that’s going now has run 5-epoch trials a few times, but now it’s running a 20-epoch trial… Has anyone observed anything like this?

Thank you very much.

That may be linked to a bug I fixed a few weeks ago where the Trainer modified its TrainingArguments: it used to change the value of max_steps, which would then change the number of epochs for you since you are changing the batch size.

Can you check if you get this behavior on current master?

Hi @sgugger, in case you’re not aware of it, it seems the latest commit on master broke the Colab notebook you shared on Twitter.

Running that notebook, I hit the following error on

best_run = trainer.hyperparameter_search(n_trials=10, direction="maximize")

with the optuna backend.

Stack trace:

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_classification.py:900: RuntimeWarning:

invalid value encountered in double_scalars

[W 2020-10-22 14:58:41,815] Trial 0 failed because of the following error: RuntimeError("Metrics contains more entries than just 'eval_loss', 'epoch' and 'total_flos', please provide your own compute_objective function.",)
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/optuna/study.py", line 799, in _run_trial
    result = func(trial)
  File "/usr/local/lib/python3.6/dist-packages/transformers/integrations.py", line 114, in _objective
    trainer.train(model_path=model_path, trial=trial)
  File "/usr/local/lib/python3.6/dist-packages/transformers/trainer.py", line 803, in train
    self._maybe_log_save_evaluate(tr_loss, model, trial, epoch)
  File "/usr/local/lib/python3.6/dist-packages/transformers/trainer.py", line 855, in _maybe_log_save_evaluate
    self._report_to_hp_search(trial, epoch, metrics)
  File "/usr/local/lib/python3.6/dist-packages/transformers/trainer.py", line 537, in _report_to_hp_search
    self.objective = self.compute_objective(metrics.copy())
  File "/usr/local/lib/python3.6/dist-packages/transformers/trainer_utils.py", line 120, in default_compute_objective
    "Metrics contains more entries than just 'eval_loss', 'epoch' and 'total_flos', please provide your own compute_objective function."
RuntimeError: Metrics contains more entries than just 'eval_loss', 'epoch' and 'total_flos', please provide your own compute_objective function.
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-26-12c3f54763db> in <module>()
----> 1 best_run = trainer.hyperparameter_search(n_trials=10, direction="maximize")

10 frames
/usr/local/lib/python3.6/dist-packages/transformers/trainer_utils.py in default_compute_objective(metrics)
    118     if len(metrics) != 0:
    119         raise RuntimeError(
--> 120             "Metrics contains more entries than just 'eval_loss', 'epoch' and 'total_flos', please provide your own compute_objective function."
    121         )
    122     return loss

RuntimeError: Metrics contains more entries than just 'eval_loss', 'epoch' and 'total_flos', please provide your own compute_objective function.

I tried passing the dict produced by trainer.evaluate() as the compute_objective arg, but that fails with TypeError: 'dict' object is not callable.

I’d be happy to fix the docs / code if you can give me some pointers on where to start!
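For what it’s worth, compute_objective needs to be a callable that takes the metrics dict and returns a single float, not the dict itself. Something like this should get past the error (assuming your compute_metrics reports an "eval_accuracy" entry; adjust the key to whatever your notebook actually returns):

def compute_objective(metrics):
    # Return the single metric the search should maximize
    return metrics["eval_accuracy"]

best_run = trainer.hyperparameter_search(
    n_trials=10, direction="maximize", compute_objective=compute_objective
)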