Trainer.hyperparameter_search(): Trials did not complete. How to optimize hyperparameters with Ray Tune?

Mel-Iza0 · January 10, 2023, 12:37pm

Hello!
I fine-tuned a DeBERTa model (microsoft/deberta-v3-base) and saved a checkpoint from it for inference.

I would like to do hyperparameter optimization with Ray Tune. My strategy was to load the model checkpoint and call hyperparameter_search, but when I do, it raises this error:

TuneError                                 Traceback (most recent call last)
<ipython-input-35-8e89fe27cbea> in <module>
     23   )
     24 
---> 25 best_run = trainer.hyperparameter_search(
     26   direction="maximize",
     27   n_trials=2

2 frames
/usr/local/lib/python3.8/dist-packages/transformers/trainer.py in hyperparameter_search(self, hp_space, compute_objective, n_trials, direction, backend, hp_name, **kwargs)
   2414             HPSearchBackend.WANDB: run_hp_search_wandb,
   2415         }
-> 2416         best_run = backend_dict[backend](self, n_trials, direction, **kwargs)
   2417 
   2418         self.hp_search_backend = None

/usr/local/lib/python3.8/dist-packages/transformers/integrations.py in run_hp_search_ray(trainer, n_trials, direction, **kwargs)
    336         dynamic_modules_import_trainable.__mixins__ = trainable.__mixins__
    337 
--> 338     analysis = ray.tune.run(
    339         dynamic_modules_import_trainable,
    340         config=trainer.hp_space(None),

/usr/local/lib/python3.8/dist-packages/ray/tune/tune.py in run(run_or_experiment, name, metric, mode, stop, time_budget_s, config, resources_per_trial, num_samples, local_dir, search_alg, scheduler, keep_checkpoints_num, checkpoint_score_attr, checkpoint_freq, checkpoint_at_end, verbose, progress_reporter, log_to_file, trial_name_creator, trial_dirname_creator, chdir_to_trial_dir, sync_config, export_formats, max_failures, fail_fast, restore, server_port, resume, reuse_actors, trial_executor, raise_on_failed_trial, callbacks, max_concurrent_trials, _experiment_checkpoint_dir, _remote, _remote_string_queue)
    754     if incomplete_trials:
    755         if raise_on_failed_trial and not state["signal"]:
--> 756             raise TuneError("Trials did not complete", incomplete_trials)
    757         else:
    758             logger.error("Trials did not complete: %s", incomplete_trials)

TuneError: ('Trials did not complete', [_objective_84707_00000, _objective_84707_00001])
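
The TuneError above only lists which trials failed, not why. A minimal sketch for digging out the underlying exception, assuming Ray Tune wrote an error.txt file into each failed trial's directory and that results live under the default ~/ray_results folder (both are assumptions about this setup, and the helper name collect_trial_errors is made up for illustration):

```python
from pathlib import Path


def collect_trial_errors(results_dir):
    """Return {trial directory name: error text} for every trial that left an error.txt."""
    errors = {}
    # rglob walks the experiment/trial subfolders, whatever their depth.
    for error_file in sorted(Path(results_dir).expanduser().rglob("error.txt")):
        errors[error_file.parent.name] = error_file.read_text(errors="replace")
    return errors


# Usage (assumed default location; adjust if a different results folder was configured):
# for trial, text in collect_trial_errors("~/ray_results").items():
#     print(f"=== {trial} ===")
#     print(text)
```

The text in those files is usually the real Python exception raised inside the trial, which is more useful for debugging than the wrapper error.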

I tried to build this from the two sources listed under References, using Ray as the hyperparameter optimizer, but I'm stuck and don't know how to proceed. To use the Ray backend, do I need the config function?
I used the first example in the Hugging Face docs as a base, and it worked fine with the GLUE dataset, but when I tried to replicate it with my own model and another dataset, I got the error above.

This is my code:

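from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

# Fine-tuned checkpoint saved earlier
model_checkpoint = '/content/microsoft-deberta-v3-base_dataset_size-200_epochs-2_batch_size-32'

def model_init():
    # Called at the start of every trial, so each trial reloads the same weights
    return AutoModelForSequenceClassification.from_pretrained(model_checkpoint, num_labels=3)

# config, per_device_train_batch_size, tokenizer, train_dataset, val_dataset and
# compute_metrics are assumed to be defined earlier in the notebook (not shown)
args = TrainingArguments(
    output_dir="/content/ZeroTraining/",
    num_train_epochs=config["num_epochs"],
    per_device_train_batch_size=per_device_train_batch_size,
    seed=42,
    evaluation_strategy="steps",
    eval_steps=100,
    disable_tqdm=True,
)

trainer = Trainer(
    args=args,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    model_init=model_init,
    compute_metrics=compute_metrics,
)

best_run = trainer.hyperparameter_search(
    direction="maximize",
    n_trials=2,
)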

Can anybody help me?
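
On the config question: the traceback shows config=trainer.hp_space(None), so the search space is whatever hp_space returns. As far as I can tell, hp_space is optional and Trainer falls back to a default space for the backend when it is not passed. Below is a rough sketch of passing one explicitly, together with an explicit objective. The value ranges are arbitrary, and the "eval_accuracy" key is an assumption that only holds if compute_metrics returns a metric named "accuracy".

```python
def ray_hp_space(trial):
    """Search space for the Ray backend; keys must be TrainingArguments field names."""
    # Imported here so this file can be loaded even where Ray is not installed.
    from ray import tune

    return {
        "learning_rate": tune.loguniform(1e-6, 1e-4),
        "per_device_train_batch_size": tune.choice([16, 32]),
        "num_train_epochs": tune.choice([2, 3, 4]),
    }


def compute_objective(metrics):
    """Return the single number to maximize from the evaluation metrics dict."""
    # "eval_accuracy" is an assumed key; it depends on what compute_metrics returns.
    return metrics["eval_accuracy"]


# Usage with the trainer defined above (not executed here):
# best_run = trainer.hyperparameter_search(
#     hp_space=ray_hp_space,
#     compute_objective=compute_objective,
#     direction="maximize",
#     backend="ray",
#     n_trials=2,
# )
```

Naming the objective explicitly makes direction="maximize" unambiguous, since it no longer depends on how the default objective combines the reported metrics.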

References

While trying this, a few questions came up about the implementation: Can I optimize the hyperparameters as part of training? Can I save the best parameters together with a checkpoint and load the best model? How do I access a model by its ID in order to load it?
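
For the question about keeping the best parameters next to a checkpoint, one possible approach is sketched below. It assumes the object returned by hyperparameter_search exposes run_id, objective and hyperparameters, and that the training arguments object accepts plain attribute assignment in the installed transformers version. The helper names save_best_run and apply_hyperparameters and the file name best_run.json are made up for illustration.

```python
import json
from pathlib import Path


def save_best_run(best_run, output_dir):
    """Write run id, objective and hyperparameters of a finished search to JSON."""
    out_path = Path(output_dir) / "best_run.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        "run_id": str(best_run.run_id),
        "objective": float(best_run.objective),
        "hyperparameters": dict(best_run.hyperparameters),
    }
    # default=str is a fallback for values the json module cannot serialize itself.
    out_path.write_text(json.dumps(payload, indent=2, default=str))
    return out_path


def apply_hyperparameters(training_args, hyperparameters):
    """Copy each tuned value onto the matching attribute of the arguments object."""
    for name, value in hyperparameters.items():
        if not hasattr(training_args, name):
            raise AttributeError(f"unknown training argument: {name}")
        setattr(training_args, name, value)
    return training_args


# Usage after the search (not executed here):
# save_best_run(best_run, args.output_dir)
# apply_hyperparameters(trainer.args, best_run.hyperparameters)
# trainer.train()  # retrain once with the best values, then save that model
```

Retraining once with the winning values and saving that model avoids having to locate the right trial checkpoint by its ID.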
