Hi!
I am trying to reproduce the experiment presented in the paper for the basic model on the RTE data, without fine-tuning.
On the RTE validation set I got an accuracy of 67.8%, while the paper reports an accuracy of 73.65%.
Are there any tips for me?
Thank you!