I’m using hyperparameter_search for hyperparameter tuning with the following configuration:
from ray import tune

# Search space passed to trainer.hyperparameter_search (trainer is a transformers.Trainer)
def tune_config_ray(trial):
    return {
        "learning_rate": tune.choice([5e-5, 4e-5, 3e-5, 2e-5]),
        "num_train_epochs": tune.choice([4]),
        "per_device_train_batch_size": tune.choice([16]),
    }

best_trial = trainer.hyperparameter_search(
    hp_space=tune_config_ray,
    backend="ray",
    direction="maximize",
    n_trials=4,
)
Based on this config, there are only 4 unique combinations of learning_rate, num_train_epochs, and per_device_train_batch_size. However, when I run the tuning (output shown below), some of the trials are duplicates. Why does this happen, and how can I make the trials non-duplicate? Could it be that Ray is also tuning other hyperparameters that are not listed in the report, which is why some trials look identical?
+------------------------+----------+-------+-----------------+--------------------+-------------------------------+
| Trial name | status | loc | learning_rate | num_train_epochs | per_device_train_batch_size |
|------------------------+----------+-------+-----------------+--------------------+-------------------------------|
| _objective_7efa7_00000 | RUNNING | | 3e-05 | 4 | 16 |
| _objective_7efa7_00001 | PENDING | | 2e-05 | 4 | 16 |
| _objective_7efa7_00002 | PENDING | | 5e-05 | 4 | 16 |
| _objective_7efa7_00003 | PENDING | | 3e-05 | 4 | 16 |
+------------------------+----------+-------+-----------------+--------------------+-------------------------------+
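To clarify what I’m after: I want each of the 4 combinations to run exactly once. A grid search over the same values is the kind of thing I have in mind, sketched below, but I don’t know whether tune.grid_search is supported through hp_space with the Ray backend here, which is part of my question (tune_config_grid is just a placeholder name):

from ray import tune

# Sketch only: enumerate every combination exactly once instead of
# sampling values independently per trial. Untested with
# trainer.hyperparameter_search(backend="ray").
def tune_config_grid(trial):
    return {
        "learning_rate": tune.grid_search([5e-5, 4e-5, 3e-5, 2e-5]),
        "num_train_epochs": tune.grid_search([4]),
        "per_device_train_batch_size": tune.grid_search([16]),
    }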