Label smoothing and compute_metrics in Trainer

I’m using a RobertaForMaskedLM model with a Trainer and I’m passing it a compute_metrics function.
Within the function I typically do something like this:

mask = p.label_ids != -100  # -100 marks positions that were not masked, so drop them
labels = p.label_ids[mask]
predictions = np.argmax(p.predictions, axis=-1)[mask]
return accuracy_metric.compute(predictions=predictions, references=labels)
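
For context, here is the whole thing spelled out as a self-contained sketch; the metric loading call is just the stock accuracy metric (load_metric from datasets, or evaluate.load in newer setups), and p is the EvalPrediction the Trainer passes in:

import numpy as np
from datasets import load_metric

accuracy_metric = load_metric("accuracy")

def compute_metrics(p):
    # p.predictions holds the logits, p.label_ids the labels with -100 at non-masked positions
    mask = p.label_ids != -100
    labels = p.label_ids[mask]
    predictions = np.argmax(p.predictions, axis=-1)[mask]
    return accuracy_metric.compute(predictions=predictions, references=labels)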

With this I hope to have the metric computed only over the positions that were actually masked. However, if I turn on label smoothing (by passing label_smoothing_factor to the Trainer), the smoothing code seems to clamp the labels to a minimum of 0 in place (line 461, trainer_pt_utils.py), so the -100 markers are gone by the time compute_metrics sees them.
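
To make the effect concrete, here is a minimal sketch of what an in-place clamp does to the labels (assuming it runs on the same tensor that later feeds compute_metrics):

import torch

label_ids = torch.tensor([-100, 5, -100, 7])
label_ids.clamp_min_(0)     # in-place: every -100 becomes 0
print(label_ids)            # tensor([0, 5, 0, 7])
print(label_ids != -100)    # tensor([True, True, True, True]) -- the mask now selects every position, not just the masked ones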

Have I approached this in the wrong way, i.e. is there another way of computing the metrics?


Ah yes, the labels should not be modified in place; this looks like a bug.

This PR should fix it.
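
Until that lands, the gist of the change is to remember the padding positions first and clamp into a new tensor instead of mutating the labels; a rough sketch (not the exact diff from the PR):

import torch

labels = torch.tensor([-100, 5, -100, 7])
padding_mask = labels.eq(-100)             # record the ignored positions first
safe_labels = torch.clamp(labels, min=0)   # out-of-place: `labels` keeps its -100 entries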
