Transformers BERT QA Task: vs

I am going to use Transformers to prune a BERT model for the downstream task SQuAD v1.1. There are two scripts for this task in the Transformers examples ( and

Because I have to add some additional code to the training process, I chose the script without the Trainer API.

However, I can't reproduce the SQuAD v1.1 result with this script. Can anyone provide a set of hyperparameters that would let me train a BERT model with that reaches an F1 score of 88 on the SQuAD v1.1 task?

(I have 4 super 2020 Ti GPUs.)

Or is there any optimization for or should I switch to

Thank you very much.