I set up Ray Tune to use with the Trainer, and I am very satisfied with how easy it is to run a basic hyperparameter search, kudos! I am trying to be more confident in my understanding of how Ray’s objective works together with the Trainer.
If I run some optimizations, it looks like this:
+------------------------+------------+-----------------+-----------------+-------------------------------+--------+-------------+
| Trial name | status | loc | learning_rate | per_device_train_batch_size | seed | objective |
|------------------------+------------+-----------------+-----------------+-------------------------------+--------+-------------|
| _objective_534cc_00003 | RUNNING | 127.0.0.1:41568 | 0.000535051 | 8 | 813 | 0.107262 |
| _objective_534cc_00000 | TERMINATED | 127.0.0.1:24216 | 1.77449e-06 | 8 | 638 | 2.19867 |
| _objective_534cc_00001 | TERMINATED | 127.0.0.1:43104 | 4.51132e-06 | 8 | 832 | 2.25701 |
| _objective_534cc_00002 | TERMINATED | 127.0.0.1:34916 | 2.69282e-05 | 8 | 855 | 2.41596 |
+------------------------+------------+-----------------+-----------------+-------------------------------+--------+-------------+
But it is not clear to me what the “objective” is. I know that this is what Ray tries to “maximize” or “minimize” depending on how you initialized it, but I don’t think it is the loss.
From reading @sgugger’s post, my guess is that if you have a custom compute_metrics method that returns values, the “objective” is the sum of all these values. If you don’t have compute_metrics, it uses the loss. My reading of that default is sketched right below.
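For reference, my mental model comes from default_compute_objective in transformers.trainer_utils, which I believe is the fallback when no compute_objective is passed to hyperparameter_search. This is my paraphrase of what I think it does, not the verbatim source:

import copy

def default_objective_sketch(metrics):
    # My paraphrase of transformers' default objective, not the real code:
    # set the loss aside and drop bookkeeping keys...
    metrics = copy.deepcopy(metrics)
    loss = metrics.pop("eval_loss", None)
    metrics.pop("epoch", None)
    # ...then fall back to the loss if compute_metrics contributed nothing,
    # otherwise sum every value compute_metrics returned.
    return loss if len(metrics) == 0 else sum(metrics.values())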
So if you have a compute_metrics that looks like this:
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(pred):
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average="weighted")
    acc = accuracy_score(labels, preds)
    return {
        "accuracy": acc,
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
Ray’s “objective” will be the sum of accuracy, f1, precision, and recall. If we want to customize that behaviour, we can pass a custom function via hyperparameter_search(compute_objective=custom_func), e.g.,
def custom_func(metrics):
    """Only optimize for F1."""
    return metrics["f1"]
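And for completeness, this is roughly how I am wiring it together; the n_trials value here is just an example from my setup:

# Sketch of my call; assumes the custom_func defined above.
best_run = trainer.hyperparameter_search(
    backend="ray",
    direction="maximize",  # F1 should be maximized, not minimized
    compute_objective=custom_func,
    n_trials=4,            # example value
)
print(best_run.objective, best_run.hyperparameters)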
I am looking for some confirmation that everything I said is correct. I’m especially not sure whether the dict values returned by compute_metrics are indeed summed to get a default objective for Ray, and whether my custom_func is used correctly.
Thanks!