T5 for IOB tagging requires prediction shifting

Hi everyone,

I built a T5 model for QA by loading a pretrained checkpoint and adjusting the LM head and the decoder embeddings. After training, the model works almost flawlessly, except that the predictions are shifted one position to the right relative to the labels. If I shift the predictions one slot to the left, the model gets 1.0 precision, recall and F1. I use 3 labels: 0 for B (beginning), 1 for O (outside) and 2 for I (inside).
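For concreteness, here is a toy illustration of that 3-class scheme (the sentence and answer span are made up):

```python
# Toy illustration of the IOB label scheme described above:
# 0 = B (first token of the answer span), 1 = O (outside), 2 = I (inside).
tokens = ["What", "a", "nice", "sunny", "day"]
labels = [1, 1, 0, 2, 2]  # hypothetical answer span: "nice sunny day"

# Recovering the span means keeping the B and I tokens.
span = [tok for tok, lab in zip(tokens, labels) if lab in (0, 2)]
print(span)  # ['nice', 'sunny', 'day']
```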

from sklearn.metrics import classification_report
import numpy as np

# Flatten labels and logits-argmax predictions from the eval output.
labels = val_preds.label_ids.reshape(-1)
predictions = val_preds.predictions[0].argmax(-1).reshape(-1)

# Drop padded positions (label -100 is ignored by the loss).
predictions = predictions[labels != -100]
labels = labels[labels != -100]

# Collapse to binary: I (2) -> 1, B and O -> 0.
predictions = np.where(predictions == 2, 1, 0)
labels = np.where(labels == 2, 1, 0)

# Shifting the predictions one position to the left makes them match perfectly.
predictions = np.roll(predictions, -1)
print(classification_report(labels, predictions, target_names=["O", "I"]))

I understand that during teacher forcing the labels become the decoder inputs, shifted one position to the right with the decoder start token prepended; but that alone doesn't explain why the predictions come out shifted.
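For reference, this is roughly what that shifting looks like, as a simplified numpy sketch of the `_shift_right` step in transformers' T5 (assuming `decoder_start_token_id = pad_token_id = 0`):

```python
import numpy as np

def shift_right(labels, decoder_start_token_id=0, pad_token_id=0):
    """Simplified sketch of T5's _shift_right: prepend the decoder start
    token, drop the last position, and replace -100 with the pad id."""
    shifted = np.roll(labels, 1, axis=-1)
    shifted[..., 0] = decoder_start_token_id
    shifted = np.where(shifted == -100, pad_token_id, shifted)
    return shifted

labels = np.array([[0, 2, 2, 1, -100]])
print(shift_right(labels))  # [[0 0 2 2 1]]
```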

T5 is trained with teacher forcing, so my hunch is that the model simply learned to copy the input fed in by the teacher.
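That hypothesis is at least consistent with the symptom: if the model just copied its shifted decoder input, its outputs would be the labels rolled one position to the right, so rolling the predictions one position to the left would restore a perfect match. A toy check:

```python
import numpy as np

labels = np.array([1, 1, 0, 2, 2, 1])
decoder_input = np.roll(labels, 1)  # teacher-forced input: labels shifted right
predictions = decoder_input         # a model that merely copies its input

print((predictions == labels).mean())              # mostly misaligned as-is
print((np.roll(predictions, -1) == labels).all())  # True after a left shift
```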