Using hyperparameter-search in Trainer

Note that you can use pretty much anything in optuna and Ray Tune by just subclassing the Trainer and overriding the proper methods.

I’m having some issues with this, under the optuna backend. Here is my hyperparameter space :

def hyperparameter_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [8, 16, 32]),
        "weight_decay": trial.suggest_float("weight_decay", 1e-12, 1e-1, log=True),
        "adam_epsilon": trial.suggest_float("adam_epsilon", 1e-10, 1e-6, log=True),
    }

When I call trainer.hyperparameter_search on this, I find that it also varies the number of epochs, even though that is fixed to 5 in TrainingArguments. The current run did a few 5-epoch trials, but now it’s running a 20-epoch trial… Has anyone observed anything like this?

Thank you very much.

That may be linked to some bug I fixed a few weeks ago with the Trainer modifying its TrainingArguments: it used to change the value of max_steps which would then change the number of epochs for you since you are changing the batch size.
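
To make the arithmetic behind this concrete: if max_steps stays frozen at the value computed for the first batch size, a later trial with a larger batch size covers more epochs in the same number of steps. A rough sketch, assuming a single device, no gradient accumulation, and a made-up dataset of 1,024 examples (the helper names are illustrative and not part of the Trainer):

```python
import math


def steps_per_epoch(num_examples, batch_size):
    """Optimizer steps needed to see every example once."""
    return math.ceil(num_examples / batch_size)


def epochs_for_fixed_steps(max_steps, num_examples, batch_size):
    """Epochs actually covered when the step budget is fixed."""
    return max_steps / steps_per_epoch(num_examples, batch_size)


NUM_EXAMPLES = 1024  # hypothetical dataset size

# Step budget computed once for 5 epochs at batch size 8, then never updated.
max_steps = 5 * steps_per_epoch(NUM_EXAMPLES, 8)

for batch_size in (8, 16, 32):
    epochs = epochs_for_fixed_steps(max_steps, NUM_EXAMPLES, batch_size)
    print(f"batch size {batch_size:>2}: {epochs} epochs")
# batch size  8: 5.0 epochs
# batch size 16: 10.0 epochs
# batch size 32: 20.0 epochs
```

Under these assumptions, a trial that samples batch size 32 after an earlier trial ran with batch size 8 would train for 20 epochs instead of 5, which matches the kind of jump described above.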

Can you check if you get this behavior on current master?

Hi @sgugger, in case you’re not aware of it, it seems the latest commit on master broke the Colab notebook you shared on Twitter

Running that notebook, I hit the following error when calling

best_run = trainer.hyperparameter_search(n_trials=10, direction="maximize")

with the optuna backend.

Stack trace:

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/ RuntimeWarning:

invalid value encountered in double_scalars

[W 2020-10-22 14:58:41,815] Trial 0 failed because of the following error: RuntimeError("Metrics contains more entries than just 'eval_loss', 'epoch' and 'total_flos', please provide your own compute_objective function.",)
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/optuna/", line 799, in _run_trial
    result = func(trial)
  File "/usr/local/lib/python3.6/dist-packages/transformers/", line 114, in _objective
    trainer.train(model_path=model_path, trial=trial)
  File "/usr/local/lib/python3.6/dist-packages/transformers/", line 803, in train
    self._maybe_log_save_evaluate(tr_loss, model, trial, epoch)
  File "/usr/local/lib/python3.6/dist-packages/transformers/", line 855, in _maybe_log_save_evaluate
    self._report_to_hp_search(trial, epoch, metrics)
  File "/usr/local/lib/python3.6/dist-packages/transformers/", line 537, in _report_to_hp_search
    self.objective = self.compute_objective(metrics.copy())
  File "/usr/local/lib/python3.6/dist-packages/transformers/", line 120, in default_compute_objective
    "Metrics contains more entries than just 'eval_loss', 'epoch' and 'total_flos', please provide your own compute_objective function."
RuntimeError: Metrics contains more entries than just 'eval_loss', 'epoch' and 'total_flos', please provide your own compute_objective function.
RuntimeError                              Traceback (most recent call last)
<ipython-input-26-12c3f54763db> in <module>()
----> 1 best_run = trainer.hyperparameter_search(n_trials=10, direction="maximize")

10 frames
/usr/local/lib/python3.6/dist-packages/transformers/ in default_compute_objective(metrics)
    118     if len(metrics) != 0:
    119         raise RuntimeError(
--> 120             "Metrics contains more entries than just 'eval_loss', 'epoch' and 'total_flos', please provide your own compute_objective function."
    121         )
    122     return loss

RuntimeError: Metrics contains more entries than just 'eval_loss', 'epoch' and 'total_flos', please provide your own compute_objective function.

I tried passing the dict produced by trainer.evaluate() in the compute_objective arg, but this complains that TypeError: 'dict' object is not callable.
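
For anyone hitting the same two errors: the traceback shows the Trainer calling self.compute_objective(metrics.copy()), so compute_objective has to be a function that takes the metrics dict and returns a single number. Passing the dict itself gives the "not callable" TypeError. The sketch below is a simplified re-creation of the default check, pieced together from the traceback and the error message, not the library source. The pop calls and the key eval_extra_metric are assumptions for illustration.

```python
def default_style_objective(metrics):
    """Return the eval loss, refusing to guess when other metrics are present."""
    metrics = dict(metrics)                # work on a copy
    loss = metrics.pop("eval_loss", None)
    metrics.pop("epoch", None)             # bookkeeping entries are ignored
    metrics.pop("total_flos", None)
    if len(metrics) != 0:                  # anything left is a user-defined metric
        raise RuntimeError(
            "Metrics contains more entries than just 'eval_loss', "
            "'epoch' and 'total_flos'"
        )
    return loss


only_loss = {"eval_loss": 0.41, "epoch": 3.0, "total_flos": 0}
with_extra = {**only_loss, "eval_extra_metric": 0.8}  # hypothetical extra metric

print(default_style_objective(only_loss))  # 0.41

try:
    default_style_objective(with_extra)
except RuntimeError as err:
    print("raised:", err)

# A dict cannot be called like a function, hence the TypeError.
print(callable(with_extra), callable(default_style_objective))  # False True
```

So once compute_metrics adds anything beyond the loss, the search needs a user-supplied function to say which entry to optimize.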

I’d be happy to fix the docs / code if you can give me some pointers on where to start!

I did notice, and this has been fixed in this commit. Thanks for the warning :)


@sgugger Early tests from the master branch seem to indicate the change in epochs is gone ( unless I’m getting very unlucky with random numbers )… Actually, I’m no longer seeing this on the latest pip release either, I think. Thanks ! Doing great things.

Hi @sgugger! Do you have any suggestion about how I should be able to use the hyperparameter-search from Trainer with optuna as backend and integrate it with wandb?

When I try to do it, it complains that wandb can only be called once per model :(

It throws the following error:
You can only call once per model. Pass a new instance of the model if you need to call again in your code.

I have no idea where the problem lies. I’ll look at it when I have some time, but we usually let the maintainers of third-party libraries like optuna and wandb fix the integrations themselves, as they know their tools better :)

Hi @sgugger! I’m trying to train my model using PopulationBasedTraining from ray. This is how I’m doing the search:

from ray.tune.schedulers import PopulationBasedTraining
from ray.tune import uniform
from random import randint

scheduler = PopulationBasedTraining(
    mode="max",
    hyperparam_mutations={
        "weight_decay": lambda: uniform(0.0, 0.3),
        "learning_rate": lambda: uniform(1e-5, 5e-5),
        "per_gpu_train_batch_size": [16, 32, 64],
        "num_train_epochs": [2, 3, 4],
        "warmup_steps": lambda: randint(0, 500),
    },
)

best_trial = trainer.hyperparameter_search(

I’m getting this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/ray/tune/", line 726, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/usr/local/lib/python3.6/dist-packages/ray/tune/", line 489, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/usr/local/lib/python3.6/dist-packages/ray/", line 1452, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(TuneError): ray::ImplicitFunc.train() (pid=1971, ip=
  File "python/ray/_raylet.pyx", line 482, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 436, in ray._raylet.execute_task.function_executor
  File "/usr/local/lib/python3.6/dist-packages/ray/tune/", line 336, in train
    result = self.step()
  File "/usr/local/lib/python3.6/dist-packages/ray/tune/", line 366, in step
  File "/usr/local/lib/python3.6/dist-packages/ray/tune/", line 513, in _report_thread_runner_error
ray.tune.error.TuneError: Trial raised an exception. Traceback:
ray::ImplicitFunc.train() (pid=1971, ip=
  File "/usr/local/lib/python3.6/dist-packages/ray/tune/", line 248, in run
  File "/usr/local/lib/python3.6/dist-packages/ray/tune/", line 316, in entrypoint
  File "/usr/local/lib/python3.6/dist-packages/ray/tune/", line 575, in _trainable_func
    output = fn()
  File "/usr/local/lib/python3.6/dist-packages/transformers/", line 180, in _objective
    trainer.train(model_path=model_path, trial=trial)
  File "/usr/local/lib/python3.6/dist-packages/transformers/", line 577, in train
  File "/usr/local/lib/python3.6/dist-packages/transformers/", line 519, in _hp_search_setup
    value = type(old_attr)(value)
TypeError: float() argument must be a string or a number, not 'Float'

I have tried using hp_space instead of defining the parameters inside the scheduler, but then the parameters don’t show up in training and I get a similar error with int instead of float.

I have no idea why you get this type Float, which can’t be cast to float. If it is the return type of ray.tune.uniform, I think you might have to convert it to a regular Python float in your lambda functions.
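
One way to read that suggestion, as a sketch only: the cast in the traceback, type(old_attr)(value), needs an actual number, so each lambda should return a sampled value rather than a search-space object. The version below uses the standard library's random module in place of ray.tune.uniform. The resample helper is a hypothetical stand-in for whatever the scheduler does when it perturbs a value, and the dict would be passed as hyperparam_mutations to PopulationBasedTraining.

```python
import random

# Each callable returns a plain Python number when invoked.
hyperparam_mutations = {
    "weight_decay": lambda: random.uniform(0.0, 0.3),
    "learning_rate": lambda: random.uniform(1e-5, 5e-5),
    "per_device_train_batch_size": [16, 32, 64],
    "num_train_epochs": [2, 3, 4],
    "warmup_steps": lambda: random.randint(0, 500),
}


def resample(spec):
    """Hypothetical helper: draw one value from a callable or a list of choices."""
    return spec() if callable(spec) else random.choice(spec)


# Mimic the cast from the traceback: type(old_attr)(value).
old_weight_decay = 0.0
new_value = resample(hyperparam_mutations["weight_decay"])
cast_value = type(old_weight_decay)(new_value)  # float(...) on a real number works

print(type(cast_value).__name__, 0.0 <= cast_value <= 0.3)  # float True
```

With a lambda that returns the object produced by ray.tune's uniform instead, the same float(...) call is what appears to fail with "not 'Float'".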

I don’t know what really happened, but I changed the parameters to:

        "weight_decay": tune.uniform(0.0, 0.3),
        "learning_rate": tune.uniform(1e-5, 5e-5),
        "per_device_train_batch_size": tune.choice([16, 32, 64]),
        "num_train_epochs": tune.choice([2,3,4]),
        "warmup_steps":tune.choice(range(0, 500))

and it seems to work.

But now I have another problem. I wrote a custom function that returns accuracy and passed it to the trainer. I want to use that accuracy as the metric in ray. I saw the example compute_objective function that you posted, but I don’t understand what metrics is or how to use accuracy with it.

Hi @tr3cks

Here, metrics is a dict containing the metrics you defined: loss, accuracy, etc.

So to use accuracy as the metric/objective for the hyperparameter search, return the accuracy value from the compute_objective function.

If your key is accuracy then you could return metrics["accuracy"] from the compute_objective function.
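
A small sketch of such a function. One thing to watch for: depending on the transformers version, the key may show up with an eval_ prefix, so this version checks both spellings. The prefix and the sample values are assumptions; printing the metrics dict once will show the real keys.

```python
def accuracy_objective(metrics):
    """Return the accuracy entry from the evaluation metrics dict."""
    for key in ("eval_accuracy", "accuracy"):  # assumed key names
        if key in metrics:
            return metrics[key]
    raise KeyError(f"no accuracy entry found; available keys: {sorted(metrics)}")


# Made-up metrics dicts standing in for what the Trainer reports.
print(accuracy_objective({"eval_loss": 0.35, "eval_accuracy": 0.91}))  # 0.91
print(accuracy_objective({"loss": 0.35, "accuracy": 0.88}))            # 0.88
```

The function itself is then passed as compute_objective=accuracy_objective to trainer.hyperparameter_search, together with direction="maximize", since higher accuracy is better.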