TensorFlow fine-tuning notebook - MRPC dataset

Hi,

I am fine-tuning different Transformers models on the MRPC dataset using TensorFlow.

When I train the model and evaluate it, it shows good accuracy.

But when I run predictions to get the logits, convert them to probabilities, and compute the metrics myself, accuracy and F1 score drop sharply.

I also checked with the GLUE benchmark metrics; accuracy and F1 score are still very low.
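
To make the step concrete, here is a minimal sketch of the logits-to-metrics conversion described above. It assumes the model returns one row of two logits per sentence pair and that label 1 is the positive (paraphrase) class. The function names and the toy logits and labels are hypothetical and only for illustration; they are not taken from the notebook.

```python
import math

def softmax(logits):
    """Convert one row of logits into probabilities (numerically stable)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_labels(logit_rows):
    """Pick the class with the highest probability for each example."""
    labels = []
    for row in logit_rows:
        probs = softmax(row)
        labels.append(probs.index(max(probs)))
    return labels

def accuracy(y_true, y_pred):
    """Fraction of examples where the prediction matches the label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the given positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy values (hypothetical): four examples, two logits each.
logit_rows = [[2.0, -1.0], [-0.5, 1.5], [0.3, 0.9], [1.2, 0.1]]
y_true = [0, 1, 1, 1]

y_pred = predict_labels(logit_rows)
print("predictions:", y_pred)
print("accuracy:", accuracy(y_true, y_pred))
print("f1:", f1_score(y_true, y_pred))
```

The comparison is only meaningful if `y_true` and `y_pred` refer to the same examples in the same order.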

Kindly help. I tried it on different datasets, but the results are the same for all of them.