Transformers Longformer classification: F1, precision and recall are all 0

I am replicating the code from this page, and F1, precision and recall all come out as 0, although the accuracy matches what the author reports. What could be the reasons?

I looked into the compute_metrics function and it seems to be correct. I tried some toy data, shown below, and precision_recall_fscore_support gives the correct answer:

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_pred = [1, 1, 2]
y_true = [1, 2, 2]
print(accuracy_score(y_true, y_pred))

# average='binary' scores only the positive class (pos_label=1 by default)
precision_recall_fscore_support(y_true, y_pred, average='binary')

(0.5, 1.0, 0.6666666666666666, None)

Since I am getting the expected accuracy, it seems that the part below is working as expected:

labels = pred.label_ids
preds = pred.predictions.argmax(-1)

acc = accuracy_score(labels, preds)
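
One situation that would produce exactly this pattern (non-zero accuracy, with F1, precision and recall all 0) is a model that only ever predicts the negative class. The sketch below is a minimal illustration of that, not the tutorial's code: the labels are made up, and it assumes 0/1 labels with 1 as the positive class. `zero_division=0` only suppresses the undefined-precision warning. Counting the predicted classes, as in the last line, might be a quick way to check whether this is what is happening.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical data (assumption): 0/1 labels, and predictions that are all class 0.
labels = np.array([0, 0, 0, 1])
preds = np.zeros_like(labels)

acc = accuracy_score(labels, preds)

# With average='binary', only the positive class (pos_label=1) is scored.
# No positive predictions means precision, recall and F1 are all 0.
precision, recall, f1, _ = precision_recall_fscore_support(
    labels, preds, average='binary', zero_division=0
)

print(acc, precision, recall, f1)

# How many times each class was predicted; a zero count for class 1
# would explain the zero scores.
pred_counts = np.bincount(preds, minlength=2)
print(pred_counts)
```

If the count for class 1 is zero on the real predictions, the metric code is probably fine and the issue would more likely be in training or in the label encoding.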