Multi-Objective Hyperparameter Optimization

Hello, I was looking at the documentation of the hyperparameter_search() method from the Trainer API (Documentation Link), and a question came up.

This is related to multi objective optimization:

  • direction (str or List[str], optional, defaults to "minimize") — If it’s single objective optimization, direction is str, can be "minimize" or "maximize", you should pick "minimize" when optimizing the validation loss, "maximize" when optimizing one or several metrics. If it’s multi objectives optimization, direction is List[str], can be List of "minimize" and "maximize", you should pick "minimize" when optimizing the validation loss, "maximize" when optimizing one or several metrics.

Can anyone show me an example of this use case?

For example, what if I want to maximize the accuracy and F1 score, but also minimize the eval loss?


Hmmm…


To answer your question about using the direction parameter in hyperparameter_search() for multi-objective optimization, let’s break it down using the context you provided and the documentation.


Example Use Case: Multi-Objective Optimization

Suppose you want to maximize the accuracy and maximize the F1 score, while simultaneously minimizing the evaluation loss. This is a multi-objective optimization problem because you have three objectives: two to maximize and one to minimize.

Steps to Implement This Use Case

  1. Define the Metrics:
    Create a compute_metrics function that returns the metrics you want to optimize (e.g., accuracy and F1 score). The Trainer computes the evaluation loss itself and reports it as eval_loss alongside these metrics, so you do not need to return it here; the direction list you pass to hyperparameter_search() will correspond, in order, to the values your objective returns (see step 4).

    from sklearn.metrics import accuracy_score, f1_score

    def compute_metrics(eval_pred):
        # eval_pred.predictions are the raw logits; take the argmax to get class ids
        predictions = eval_pred.predictions.argmax(axis=1)
        labels = eval_pred.label_ids
        accuracy = accuracy_score(labels, predictions)
        f1 = f1_score(labels, predictions, average="weighted")
        # The evaluation loss is computed by the Trainer itself and added
        # to the metrics as "eval_loss", so it is not returned here.
        return {"accuracy": accuracy, "f1": f1}
    
  2. Set Up the Trainer:
    Initialize the Trainer with a model_init function (hyperparameter_search() needs one so that each trial starts from a freshly initialized model), your datasets, and the compute_metrics function.

    from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

    def model_init():
        # Re-created for every trial by hyperparameter_search()
        return AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

    training_args = TrainingArguments(output_dir="results", evaluation_strategy="epoch")
    trainer = Trainer(
        model_init=model_init,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        compute_metrics=compute_metrics,
    )
    
  3. Define the Hyperparameter Search Space:
    Specify the hyperparameters you want to optimize (e.g., learning rate, weight decay). For the Optuna backend, hp_space is a function that receives a trial object and returns the sampled values.

    def hp_space(trial):
        return {
            "learning_rate": trial.suggest_float("learning_rate", 2e-5, 3e-5, log=True),
            "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.1),
        }
    
  4. Perform Hyperparameter Search:
    Use the hyperparameter_search() method with the direction parameter set to ["maximize", "maximize", "minimize"] to reflect your optimization goals: maximize accuracy, maximize F1 score, and minimize evaluation loss. Because the default objective collapses all metrics into a single number, also pass a compute_objective that returns one value per direction, in the same order.

    def compute_objective(metrics):
        # One value per entry in `direction`, in the same order.
        # The Trainer prefixes the metrics from compute_metrics with "eval_".
        return metrics["eval_accuracy"], metrics["eval_f1"], metrics["eval_loss"]

    best_run = trainer.hyperparameter_search(
        hp_space=hp_space,
        compute_objective=compute_objective,
        direction=["maximize", "maximize", "minimize"],  # match the order of the returned values
        backend="optuna",  # or "ray", "sigopt", etc.
        n_trials=32,
    )
    
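With multiple directions there is no single best trial. The return type documented for hyperparameter_search() is BestRun or a list of BestRun for multi-objective optimization, so with the Optuna backend you should get back the set of Pareto-optimal runs. A minimal way to inspect them (a sketch, assuming the call above):

for run in best_run:
    # each BestRun carries the trial id, the objective value(s), and the sampled hyperparameters
    print(run.run_id, run.objective, run.hyperparameters)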

Key Points to Note

  • The direction parameter must be a list of strings (["maximize", "maximize", "minimize"]), with one entry per objective, in the same order as the values returned by compute_objective.
  • If you do not provide a custom compute_objective function, the default behavior is to sum the metric values into a single number, which does not work for multiple directions. That is why the example in step 4 defines a compute_objective that returns one value per direction.
  • The backend parameter determines the optimization framework (e.g., “optuna”, “ray”, “sigopt”). Each backend may have its own requirements for the hyperparameter space, so refer to the documentation for the backend you are using [1].

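For reference, the metrics dictionary handed to compute_objective is the output of trainer.evaluate(), so with the compute_metrics above it looks roughly like this (the values are made up, and the exact set of speed-related keys can vary):

{
    "eval_loss": 0.35,
    "eval_accuracy": 0.87,
    "eval_f1": 0.86,
    "eval_runtime": 12.3,
    "eval_samples_per_second": 162.0,
    "eval_steps_per_second": 5.1,
    "epoch": 1.0,
}
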
References

For additional details, refer to the following sources:

  • [1] Documentation on hyperparameter_search() and its parameters: Trainer
  • [2] Examples of hyperparameter search implementations: Trainer

I noticed that the eval_loss is calculated internally by the Trainer and returned in the dictionary of metrics.

Let’s say I have the following objective

def objective(metrics):
    # could be any computation
    return metrics["eval_accuracy"] + metrics["eval_f1"]

The Trainer will add that objective as an extra metric. If I want to maximize it, but also minimize the loss, how does hyperparameter_search know the order of the metrics once the objective is added, so that I can set the direction parameter correctly?

best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    compute_objective=objective,
    direction=...,
    backend="optuna",  # or "ray", "sigopt", etc.
    n_trials=32,
)

by HuggingChat


To handle multi-objective optimization in Optuna where you want to maximize one metric and minimize another, follow these steps:

  1. Modify the Objective Function: Update the objective function to return a tuple containing both metrics. This allows Optuna to treat each metric separately with its own direction.

    def objective(metrics):
        # Metric to maximize
        combined_metric = metrics["eval_accuracy"] + metrics["eval_f1"]
        # Metric to minimize (the Trainer reports the loss as "eval_loss")
        loss = metrics["eval_loss"]
        return (combined_metric, loss)
    
  2. Set Directions in hyperparameter_search: Specify the directions for each objective in the direction parameter as a list.

    best_run = trainer.hyperparameter_search(
        hp_space=hp_space,
        compute_objective=objective,
        direction=["maximize", "minimize"],
        backend="optuna",
        n_trials=32,
    )
    

By structuring your objective function to return separate metrics and specifying the respective directions, Optuna will correctly optimize each objective according to its direction.


Answer

To handle multi-objective optimization in Optuna where you want to maximize one metric and minimize another, you should modify your objective function to return a tuple of the metrics you want to optimize. Then, you specify the direction for each metric as a list in the direction parameter of the hyperparameter_search method.

For example, if you want to maximize the combined accuracy and F1 score while minimizing the loss, you should structure your objective function to return both metrics, and set the direction as ["maximize", "minimize"]. Here’s how you can implement it:

def objective(metrics):
    # Combine accuracy and F1 score (to maximize)
    combined_metric = metrics["eval_accuracy"] + metrics["eval_f1"]
    # Loss (to minimize); the Trainer reports it as "eval_loss"
    loss = metrics["eval_loss"]
    return (combined_metric, loss)

best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    compute_objective=objective,
    direction=["maximize", "minimize"],  # [1]
    backend="optuna",
    n_trials=32,
)

This setup ensures that Optuna optimizes the combined metric in the "maximize" direction and the loss in the "minimize" direction [1]. The direction list is matched positionally to the values returned by compute_objective, so the order of the keys in the metrics dictionary does not matter once you supply your own objective.

[1]
