Hello, I was looking at the documentation of the hyperparameter_search() method (Documentation Link) from the Trainer API, and a question came up.
This is related to multi-objective optimization:
- direction (str or List[str], optional, defaults to "minimize") — If it’s single objective optimization, direction is str, can be "minimize" or "maximize"; you should pick "minimize" when optimizing the validation loss, "maximize" when optimizing one or several metrics. If it’s multi objectives optimization, direction is List[str], can be a List of "minimize" and "maximize"; you should pick "minimize" when optimizing the validation loss, "maximize" when optimizing one or several metrics.
Can anyone show me an example of this use case?
Like for example, what if I want to maximize the accuracy and f1 score, but also minimize the eval loss?
Hmmm…
To answer your question about using the direction parameter in hyperparameter_search() for multi-objective optimization, let’s break it down using the context you provided and the documentation.
Example Use Case: Multi-Objective Optimization
Suppose you want to maximize the accuracy and maximize the F1 score, while simultaneously minimizing the evaluation loss. This is a multi-objective optimization problem because you have three conflicting objectives: two to maximize and one to minimize.
Steps to Implement This Use Case
- Define the Metrics: Create a compute_metrics function that returns the metrics you want to optimize (e.g., accuracy, F1 score, and evaluation loss). The order of the metrics returned in compute_metrics must match the order of the direction list in hyperparameter_search().
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    predictions = eval_pred.predictions.argmax(axis=1)
    labels = eval_pred.label_ids
    accuracy = accuracy_score(labels, predictions)
    f1 = f1_score(labels, predictions, average="weighted")
    # Assuming evaluation loss is computed elsewhere
    eval_loss = ...  # Replace with actual evaluation loss
    return {"accuracy": accuracy, "f1": f1, "eval_loss": eval_loss}
- Set Up the Trainer: Initialize the Trainer with your datasets, the compute_metrics function, and a model_init function (hyperparameter_search re-instantiates the model for every trial, so it expects model_init rather than a fixed model instance).
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

def model_init():
    # hyperparameter_search builds a fresh model for every trial,
    # so the model is passed through model_init instead of model=
    return AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

training_args = TrainingArguments(output_dir="results", evaluation_strategy="epoch")

trainer = Trainer(
    model_init=model_init,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)
- Define the Hyperparameter Search Space: Specify the hyperparameters you want to tune (e.g., learning rate, weight decay). With the Optuna backend, hp_space is a function that receives a trial and returns the values to try.
def hp_space(trial):
    # With the Optuna backend, hp_space is a function of the trial
    return {
        "learning_rate": trial.suggest_categorical("learning_rate", [2e-5, 3e-5]),
        "weight_decay": trial.suggest_categorical("weight_decay", [0.0, 0.1]),
    }
- Perform Hyperparameter Search: Use the hyperparameter_search() method with the direction parameter set to ["maximize", "maximize", "minimize"] to reflect your optimization goals: maximize accuracy, maximize F1 score, and minimize evaluation loss.
best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    direction=["maximize", "maximize", "minimize"],  # match the order of the metrics
    backend="optuna",  # or "ray", "sigopt", etc.
    n_trials=32,
)
Key Points to Note
- The direction parameter must be a list of strings (["maximize", "maximize", "minimize"]) specifying the optimization direction for each objective, in the order the objectives are returned (by compute_metrics, or by a custom compute_objective if you provide one).
- If you do not provide a custom compute_objective function, the default behavior is to sum the metric values, which might not align with your optimization goals. For multi-objective optimization you therefore usually need to define a custom compute_objective function; a sketch follows this list.
- The backend parameter determines the optimization framework (e.g., "optuna", "ray", "sigopt"). Each backend may have its own requirements for the hyperparameter space, so refer to the documentation for the backend you are using [1].
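For example, to make the three-element direction list above line up with explicit objectives, a custom compute_objective along these lines could work. This is a minimal sketch, not the library's default behavior, and it assumes the "eval_"-prefixed metric names that Trainer.evaluate() produces:

def compute_objective(metrics):
    # Return the objectives in the same order as the direction list:
    # ["maximize", "maximize", "minimize"] -> accuracy, F1, evaluation loss
    return [metrics["eval_accuracy"], metrics["eval_f1"], metrics["eval_loss"]]

best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    compute_objective=compute_objective,
    direction=["maximize", "maximize", "minimize"],
    backend="optuna",
    n_trials=32,
)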
References
For additional details, refer to the following sources:
- [1] Documentation on hyperparameter_search() and its parameters: Trainer
- [2] Examples of hyperparameter search implementations: Trainer
I noticed that the eval_loss is calculated internally by the Trainer and returned in the dictionary of metrics.
Let’s say I have the following objective:
def objective(metrics):
    # could be any computation
    return metrics["eval_accuracy"] + metrics["eval_f1"]
The Trainer will add that objective as an extra metric. If I want to maximize it but also minimize the loss, how does hyperparameter_search know the order of the objectives once the objective is added, so that I can set the direction parameter correctly?
best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    compute_objective=objective,
    direction=...,
    backend="optuna",  # or "ray", "sigopt", etc.
    n_trials=32,
)
by HuggingChat
To handle multi-objective optimization in Optuna where you want to maximize one metric and minimize another, follow these steps:
- Modify the Objective Function: Update the objective function to return a tuple containing both metrics. This allows Optuna to treat each metric separately, with its own direction.
def objective(metrics):
    # Metric to maximize
    combined_metric = metrics["eval_accuracy"] + metrics["eval_f1"]
    # Metric to minimize (the Trainer reports the evaluation loss as "eval_loss")
    loss = metrics["eval_loss"]
    return (combined_metric, loss)
- Set Directions in hyperparameter_search: Specify the direction for each objective in the direction parameter as a list.
best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    compute_objective=objective,
    direction=["maximize", "minimize"],
    backend="optuna",
    n_trials=32,
)
By structuring your objective function to return separate metrics and specifying the respective directions, Optuna will correctly optimize each objective according to its direction.
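If it helps, here is a rough sketch of inspecting the result. The field names follow transformers.trainer_utils.BestRun, and with a list of directions hyperparameter_search may return several best runs (one per Pareto-optimal trial) rather than a single one:

# best_run comes from the hyperparameter_search call above.
# With multiple directions it may be a list of BestRun objects
# (one per Pareto-optimal trial) instead of a single BestRun.
runs = best_run if isinstance(best_run, list) else [best_run]
for run in runs:
    print(run.run_id, run.objective, run.hyperparameters)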