This model scores much higher F1/EM when evaluated on the squad2 validation set than the model card reports. Is the card outdated?