Hello, I was looking at the documentation of the hyperparameter_search() method (Documentation Link) from the Trainer API, and a question came up.
This is related to multi-objective optimization:
- direction (str or List[str], optional, defaults to "minimize") — If it’s single objective optimization, direction is str, can be "minimize" or "maximize"; you should pick "minimize" when optimizing the validation loss, "maximize" when optimizing one or several metrics. If it’s multi objectives optimization, direction is List[str], can be a List of "minimize" and "maximize"; you should pick "minimize" when optimizing the validation loss, "maximize" when optimizing one or several metrics.
Can anyone show me an example of this use case?
Like for example, what if I want to maximize the accuracy and f1 score, but also minimize the eval loss?
Hmmm…
To answer your question about using the direction parameter in hyperparameter_search() for multi-objective optimization, let’s break it down using the context you provided and the documentation.
Example Use Case: Multi-Objective Optimization
Suppose you want to maximize the accuracy and maximize the F1 score, while simultaneously minimizing the evaluation loss. This is a multi-objective optimization problem because you have three conflicting objectives: two to maximize and one to minimize.
Steps to Implement This Use Case
- Define the Metrics: Create a compute_metrics function that returns the metrics you want to optimize (e.g., accuracy, F1 score, and evaluation loss). The order of the metrics returned in compute_metrics must match the order of the direction list in hyperparameter_search().
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    predictions = eval_pred.predictions.argmax(axis=1)
    labels = eval_pred.label_ids
    accuracy = accuracy_score(labels, predictions)
    f1 = f1_score(labels, predictions, average="weighted")
    # Assuming evaluation loss is computed elsewhere
    eval_loss = ...  # Replace with actual evaluation loss
    return {"accuracy": accuracy, "f1": f1, "eval_loss": eval_loss}
- Set Up the Trainer: Initialize the Trainer with your datasets, the compute_metrics function, and a model_init function (hyperparameter_search re-instantiates the model for every trial, so it expects model_init rather than a fixed model instance).
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

def model_init():
    # hyperparameter_search builds a fresh model for every trial,
    # so the model is passed through model_init instead of model=
    return AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

training_args = TrainingArguments(output_dir="results", evaluation_strategy="epoch")

trainer = Trainer(
    model_init=model_init,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)
- Define the Hyperparameter Search Space: Specify the hyperparameters you want to tune (e.g., learning rate, weight decay). With the Optuna backend, hp_space is a function that receives a trial and returns the values to try.
def hp_space(trial):
    # With the Optuna backend, hp_space is a function of the trial
    return {
        "learning_rate": trial.suggest_categorical("learning_rate", [2e-5, 3e-5]),
        "weight_decay": trial.suggest_categorical("weight_decay", [0.0, 0.1]),
    }
- Perform Hyperparameter Search: Use the hyperparameter_search() method with the direction parameter set to ["maximize", "maximize", "minimize"] to reflect your optimization goals: maximize accuracy, maximize F1 score, and minimize evaluation loss.
best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    direction=["maximize", "maximize", "minimize"],  # match the order of the metrics
    backend="optuna",  # or "ray", "sigopt", etc.
    n_trials=32,
)
Key Points to Note
- The direction parameter must be a list of strings (["maximize", "maximize", "minimize"]) specifying the optimization direction for each objective, in the order the objectives are returned (by compute_metrics, or by a custom compute_objective if you provide one).
- If you do not provide a custom compute_objective function, the default behavior is to sum the metric values, which might not align with your optimization goals. For multi-objective optimization you therefore usually need to define a custom compute_objective function; a sketch follows this list.
- The backend parameter determines the optimization framework (e.g., "optuna", "ray", "sigopt"). Each backend may have its own requirements for the hyperparameter space, so refer to the documentation for the backend you are using [1].
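For example, to make the three-element direction list above line up with explicit objectives, a custom compute_objective along these lines could work. This is a minimal sketch, not the library's default behavior, and it assumes the "eval_"-prefixed metric names that Trainer.evaluate() produces:

def compute_objective(metrics):
    # Return the objectives in the same order as the direction list:
    # ["maximize", "maximize", "minimize"] -> accuracy, F1, evaluation loss
    return [metrics["eval_accuracy"], metrics["eval_f1"], metrics["eval_loss"]]

best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    compute_objective=compute_objective,
    direction=["maximize", "maximize", "minimize"],
    backend="optuna",
    n_trials=32,
)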
References
For additional details, refer to the following sources:
- [1] Documentation on hyperparameter_search() and its parameters: Trainer
- [2] Examples of hyperparameter search implementations: Trainer
I noticed that the eval_loss is calculated internally by the Trainer and returned in the dictionary of metrics.
Let’s say I have the following objective:
def objective(metrics):
    # could be any computation
    return metrics["eval_accuracy"] + metrics["eval_f1"]
The Trainer will add that objective as an extra metric. If I want to maximize it but also minimize the loss, how does hyperparameter_search know the order of the objectives once the objective is added, so that I can set the direction parameter correctly?
best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    compute_objective=objective,
    direction=...,
    backend="optuna",  # or "ray", "sigopt", etc.
    n_trials=32,
)
by HuggingChat
To handle multi-objective optimization in Optuna where you want to maximize one metric and minimize another, follow these steps:
- Modify the Objective Function: Update the objective function to return a tuple containing both metrics. This allows Optuna to treat each metric separately, with its own direction.
def objective(metrics):
    # Metric to maximize
    combined_metric = metrics["eval_accuracy"] + metrics["eval_f1"]
    # Metric to minimize (the Trainer reports the evaluation loss as "eval_loss")
    loss = metrics["eval_loss"]
    return (combined_metric, loss)
- Set Directions in hyperparameter_search: Specify the direction for each objective in the direction parameter as a list.
best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    compute_objective=objective,
    direction=["maximize", "minimize"],
    backend="optuna",
    n_trials=32,
)
By structuring your objective function to return separate metrics and specifying the respective directions, Optuna will correctly optimize each objective according to its direction.
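If it helps, here is a rough sketch of inspecting the result. The field names follow transformers.trainer_utils.BestRun, and with a list of directions hyperparameter_search may return several best runs (one per Pareto-optimal trial) rather than a single one:

# best_run comes from the hyperparameter_search call above.
# With multiple directions it may be a list of BestRun objects
# (one per Pareto-optimal trial) instead of a single BestRun.
runs = best_run if isinstance(best_run, list) else [best_run]
for run in runs:
    print(run.run_id, run.objective, run.hyperparameters)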