I fine-tune different transformer models on the MRPC dataset using TensorFlow.
When I train a model and evaluate it, it shows good accuracy.
But when I take the predicted logits, convert them to probabilities, and compute the metrics myself, accuracy and F1 score drop sharply.
I also checked with the GLUE benchmark metrics, and accuracy and F1 score are still very low.
Kindly help. I tried it on different datasets, but the results are the same for every one.
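
To make the prediction step concrete, here is a minimal sketch of the logits-to-metrics conversion described above. It is plain Python rather than my actual TensorFlow code, and it assumes the model returns two logits per sentence pair, with label 1 as the positive (paraphrase) class. The helper names and the example values are hypothetical, for illustration only.

```python
import math

def softmax(logits):
    """Convert one row of logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_labels(batch_logits):
    """Pick the most probable class index for each row of logits."""
    probs = [softmax(row) for row in batch_logits]
    return [max(range(len(p)), key=p.__getitem__) for p in probs]

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_binary(y_true, y_pred, positive=1):
    """F1 score for the positive class of a binary task."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical logits and labels, not real MRPC outputs.
example_logits = [[2.0, -1.0], [0.5, 1.5], [-0.3, 0.2], [1.2, 0.1]]
example_labels = [0, 1, 1, 1]

example_preds = predict_labels(example_logits)
example_accuracy = accuracy(example_labels, example_preds)
example_f1 = f1_binary(example_labels, example_preds)
```

If a check like this gives much lower numbers than the built-in evaluation, one thing that may be worth verifying is that the predictions and the labels are in the same order, for example that the dataset is not reshuffled between the prediction pass and the pass that collects the labels.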