Label smoothing and compute_metrics in Trainer

I’m using a RobertaForMaskedLM model with a Trainer and I’m passing it a compute_metrics function.
Within the function I typically do something like this:

mask = p.label_ids != -100  # -100 marks positions that were not masked, so drop them
labels = p.label_ids[mask]
predictions = np.argmax(p.predictions, axis=-1)[mask]
return accuracy_metric.compute(predictions=predictions, references=labels)
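
For context, here is the whole thing spelled out as a self-contained sketch; the metric loading call is just the stock accuracy metric (load_metric from datasets, or evaluate.load in newer setups), and p is the EvalPrediction the Trainer passes in:

import numpy as np
from datasets import load_metric

accuracy_metric = load_metric("accuracy")

def compute_metrics(p):
    # p.predictions holds the logits, p.label_ids the labels with -100 at non-masked positions
    mask = p.label_ids != -100
    labels = p.label_ids[mask]
    predictions = np.argmax(p.predictions, axis=-1)[mask]
    return accuracy_metric.compute(predictions=predictions, references=labels)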

With this I hope to have the metric computed only over the positions that were actually masked. However, if I turn on label smoothing (by passing label_smoothing_factor to the Trainer), the smoothing code seems to clamp the labels to a minimum of 0 in place (line 461, trainer_pt_utils.py), so the -100 markers are gone by the time compute_metrics sees them.
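
To make the effect concrete, here is a minimal sketch of what an in-place clamp does to the labels (assuming it runs on the same tensor that later feeds compute_metrics):

import torch

label_ids = torch.tensor([-100, 5, -100, 7])
label_ids.clamp_min_(0)     # in-place: every -100 becomes 0
print(label_ids)            # tensor([0, 5, 0, 7])
print(label_ids != -100)    # tensor([True, True, True, True]) -- the mask now selects every position, not just the masked ones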

Have I approached this in the wrong way, i.e. is there another way of computing the metrics?


Ah yes, the labels should not be modified in place; this looks like a bug.

This PR should fix it.
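
Until that lands, the gist of the change is to remember the padding positions first and clamp into a new tensor instead of mutating the labels; a rough sketch (not the exact diff from the PR):

import torch

labels = torch.tensor([-100, 5, -100, 7])
padding_mask = labels.eq(-100)             # record the ignored positions first
safe_labels = torch.clamp(labels, min=0)   # out-of-place: `labels` keeps its -100 entries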
