I’m using hyperparameter_search for hyperparameter tuning with the following configuration:
from ray import tune

# Search space passed to trainer.hyperparameter_search (trainer is a transformers.Trainer)
def tune_config_ray(trial):
    return {
        "learning_rate": tune.choice([5e-5, 4e-5, 3e-5, 2e-5]),
        "num_train_epochs": tune.choice([4]),
        "per_device_train_batch_size": tune.choice([16]),
    }

best_trial = trainer.hyperparameter_search(
    hp_space=tune_config_ray,
    backend="ray",
    direction="maximize",
    n_trials=4,
)
Based on this config, there are only 4 unique combinations of learning_rate, num_train_epochs, and per_device_train_batch_size. However, when I run the tuning (output shown below), some of the trials are duplicates. Why does this happen, and how can I make the trials non-duplicate? Could it be that Ray is also tuning other hyperparameters that are not listed in the report, which is why some trials look identical?
+------------------------+----------+-------+-----------------+--------------------+-------------------------------+
| Trial name | status | loc | learning_rate | num_train_epochs | per_device_train_batch_size |
|------------------------+----------+-------+-----------------+--------------------+-------------------------------|
| _objective_7efa7_00000 | RUNNING | | 3e-05 | 4 | 16 |
| _objective_7efa7_00001 | PENDING | | 2e-05 | 4 | 16 |
| _objective_7efa7_00002 | PENDING | | 5e-05 | 4 | 16 |
| _objective_7efa7_00003 | PENDING | | 3e-05 | 4 | 16 |
+------------------------+----------+-------+-----------------+--------------------+-------------------------------+
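To clarify what I’m after: I want each of the 4 combinations to run exactly once. A grid search over the same values is the kind of thing I have in mind, sketched below, but I don’t know whether tune.grid_search is supported through hp_space with the Ray backend here, which is part of my question (tune_config_grid is just a placeholder name):

from ray import tune

# Sketch only: enumerate every combination exactly once instead of
# sampling values independently per trial. Untested with
# trainer.hyperparameter_search(backend="ray").
def tune_config_grid(trial):
    return {
        "learning_rate": tune.grid_search([5e-5, 4e-5, 3e-5, 2e-5]),
        "num_train_epochs": tune.grid_search([4]),
        "per_device_train_batch_size": tune.grid_search([16]),
    }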