SQuAD/BERT: Why max_length=384 by default and not 512?

We use the same default as the Google scripts so that their results can be reproduced. My guess is that 384 was a compromise for the regular SQuAD dataset: long enough that most question/context pairs are tokenized without any truncation, yet short enough to keep training and evaluation fast.
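To make the trade-off concrete, here is a rough sketch in plain Python, not the actual code from either script. The helper names `count_windows` and `attention_cost_ratio` are made up for illustration, and the windowing assumes the convention I believe the Google script uses: three special tokens per input, and a window that advances by `doc_stride` tokens (128 is assumed as the stride here).

```python
def count_windows(context_len, question_len, max_length=384, doc_stride=128):
    """Count how many overlapping windows one long context is split into.

    Hypothetical helper: assumes 3 special tokens ([CLS], [SEP], [SEP])
    and a window start that advances by doc_stride tokens.
    """
    room = max_length - question_len - 3  # tokens left for the context
    if room <= 0:
        raise ValueError("question leaves no room for context tokens")
    windows = 0
    start = 0
    while True:
        windows += 1
        length = min(room, context_len - start)
        if start + length >= context_len:
            break  # this window reaches the end of the context
        start += min(length, doc_stride)
    return windows


def attention_cost_ratio(long_len, short_len):
    """Relative cost of the self-attention term, which grows with length squared."""
    return (long_len / short_len) ** 2


# A short context fits in a single window at 384, so nothing is truncated or split.
print(count_windows(context_len=150, question_len=12, max_length=384))
# A long context needs more windows at 384 than at 512.
print(count_windows(context_len=500, question_len=12, max_length=384))
print(count_windows(context_len=500, question_len=12, max_length=512))
# Attention over 512 tokens costs noticeably more than over 384 tokens.
print(attention_cost_ratio(512, 384))
```

So if most examples already fit in one window at 384, going to 512 mostly means more padding and a more expensive attention step per example, and only the occasional long context ends up with fewer windows.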
