Understanding how Ray "objective" works with the trainer

I set up Ray to use with the Trainer, and I am very pleased with how easy it is to get a basic hyperparameter search running, kudos! I would like to be more confident in my understanding of how Ray’s objective works together with the Trainer.

If I run some optimizations, it looks like this:

+------------------------+------------+-----------------+-----------------+-------------------------------+--------+-------------+
| Trial name             | status     | loc             |   learning_rate |   per_device_train_batch_size |   seed |   objective |
|------------------------+------------+-----------------+-----------------+-------------------------------+--------+-------------|
| _objective_534cc_00003 | RUNNING    | 127.0.0.1:41568 |     0.000535051 |                             8 |    813 |    0.107262 |
| _objective_534cc_00000 | TERMINATED | 127.0.0.1:24216 |     1.77449e-06 |                             8 |    638 |    2.19867  |
| _objective_534cc_00001 | TERMINATED | 127.0.0.1:43104 |     4.51132e-06 |                             8 |    832 |    2.25701  |
| _objective_534cc_00002 | TERMINATED | 127.0.0.1:34916 |     2.69282e-05 |                             8 |    855 |    2.41596  |
+------------------------+------------+-----------------+-----------------+-------------------------------+--------+-------------+

But it is not clear to me what the “objective” is. I know that this is what Ray tries to “maximize” or “minimize” depending on how you initialized it, but I don’t think it is loss.

From reading @sgugger’s post, my guess is that if you have a custom compute_metrics function that returns values, the “objective” is the sum of all these values. If you don’t have compute_metrics, the loss is used instead.

So if you have compute_metrics that looks like this:

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(pred):
    labels = pred.label_ids
    # Predicted class is the index of the highest logit
    preds = pred.predictions.argmax(-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average="weighted")
    acc = accuracy_score(labels, preds)
    return {
        "accuracy": acc,
        "f1": f1,
        "precision": precision,
        "recall": recall
    }

Ray’s “objective” will be the sum of accuracy, f1, precision, and recall. If we want to customize that behaviour, we can pass a custom function to trainer.hyperparameter_search(compute_objective=custom_func), e.g.,

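def custom_func(metrics):
    """Only optimize for F1"""
    # The Trainer prefixes evaluation metric keys with "eval_",
    # so the "f1" returned by compute_metrics arrives here as "eval_f1".
    return metrics["eval_f1"]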
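To show where such a function plugs in, here is a minimal sketch of the search call. The wrapper name run_search and its arguments are hypothetical, just for illustration; the keyword arguments are the ones Trainer.hyperparameter_search accepts as far as I know. It assumes the Trainer was created with model_init (required for hyperparameter search) and that Ray Tune is installed.

```python
def run_search(trainer, objective_fn, n_trials=4):
    """Run a Ray-backed search that maximizes objective_fn (hypothetical helper).

    trainer      -- a transformers.Trainer built with model_init=...
    objective_fn -- maps the evaluation metrics dict to a single float
    """
    return trainer.hyperparameter_search(
        backend="ray",
        # F1 and similar scores should go up, so maximize
        direction="maximize",
        compute_objective=objective_fn,
        n_trials=n_trials,
    )
```

You would call it as run_search(trainer, custom_func); the returned best run holds the winning hyperparameters and its objective value.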

I am looking for some confirmation that everything that I said is correct. I’m especially not sure about whether the dict values of compute_metrics are indeed summed to get a default objective for Ray, and whether my “custom_func” is used correctly.

Thanks!

After some further digging, I can confirm that my first post is correct. If you don’t specify a custom function for compute_objective, the sum of the values returned by compute_metrics is used. If you do not specify your own compute_metrics, the evaluation loss is used. Note that this is important! You usually want to minimize the loss but maximize secondary metrics (like F1, correlation scores, etc.), so the direction parameter is crucial: hyperparameter_search(direction="minimize") when the objective is the loss, hyperparameter_search(direction="maximize") when it is a score. The default is "minimize".
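For anyone who wants to see the default behaviour spelled out, below is a rough sketch of what I understand the default objective to do. It is modeled on my reading of default_compute_objective in transformers.trainer_utils and is not a copy of it; the function name default_objective_sketch and the sample metric values are made up, and the exact set of keys that get dropped may differ between versions.

```python
import copy

def default_objective_sketch(metrics):
    """Approximate the Trainer's default objective (illustrative only)."""
    metrics = copy.deepcopy(metrics)
    # The loss is set aside and only used as a fallback
    loss = metrics.pop("eval_loss", None)
    metrics.pop("epoch", None)
    # Timing/throughput entries are not quality metrics, so drop them
    speed_keys = [k for k in metrics
                  if k.endswith("_runtime") or k.endswith("_per_second")]
    for key in speed_keys:
        metrics.pop(key)
    # Nothing left means no compute_metrics was given: fall back to the loss
    return loss if len(metrics) == 0 else sum(metrics.values())

# Made-up evaluation results with and without compute_metrics
with_metrics = {"eval_loss": 0.5, "eval_accuracy": 0.75, "eval_f1": 0.5,
                "epoch": 3.0, "eval_runtime": 12.0}
loss_only = {"eval_loss": 0.5, "epoch": 3.0, "eval_runtime": 12.0}

print(default_objective_sketch(with_metrics))  # sum of accuracy and f1
print(default_objective_sketch(loss_only))     # falls back to the loss
```

In the first case the objective is a sum of scores, so the search should use direction="maximize"; in the second it is the loss, so direction="minimize" is the right choice.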