Hi,
I fine-tuned the google/mobilebert-uncased model on MRPC and got ~87% validation
accuracy. When I evaluate on the test split, I get only ~83%. The MobileBERT paper
also reports something like 87%.
My question: am I doing something wrong, or does the paper report results on the validation split?