SQuAD/BERT: Why max_length=384 by default and not 512?

mgreenbe · November 14, 2021, 7:53pm

Why do training scripts for fine-tuning BERT-based models on SQuAD (e.g., this one from Google or this one from HuggingFace) set a maximum length of 384 by default for input sequences, even though the models can handle inputs of up to 512 tokens? (This maximum length refers to the combined length of the question and context, right? Regardless, the questions in the SQuAD dataset are typically much shorter than 128 tokens.)

sgugger · November 15, 2021, 12:58am

We use the same default as the Google scripts to reproduce their results. I’m guessing 384 was a compromise for the regular SQuAD dataset: long enough that most question/context pairs are tokenized without any truncation, while keeping sequences short enough to train quickly.
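
To put rough numbers on that trade-off, here is a minimal sketch in plain Python. It assumes the usual BERT pair layout ([CLS] question [SEP] context [SEP], so three special tokens) and a 20-token question, which is an illustrative figure rather than a measured SQuAD statistic. The helper names are hypothetical and do not come from either script. The cost figure covers only the self-attention score matrix, which grows with the square of the sequence length; other parts of the model scale roughly linearly, so the real speedup will differ.

```python
# Illustrative sketch only: helper names are hypothetical, not taken from
# the Google or HuggingFace scripts.

SPECIAL_TOKENS = 3  # assumed layout: [CLS] question [SEP] context [SEP]


def context_budget(max_length: int, question_tokens: int) -> int:
    """Tokens left for the context once the question and special tokens are placed."""
    return max_length - question_tokens - SPECIAL_TOKENS


def relative_attention_cost(max_length: int, reference: int = 512) -> float:
    """Size of the attention score matrix relative to a reference sequence length."""
    return (max_length / reference) ** 2


if __name__ == "__main__":
    question_tokens = 20  # assumption for illustration
    for max_length in (384, 512):
        budget = context_budget(max_length, question_tokens)
        cost = relative_attention_cost(max_length)
        print(f"max_length={max_length}: context budget={budget}, "
              f"relative attention cost={cost:.4f}")
```

Under these assumptions, 384 still leaves 361 tokens for the context, while the attention score matrix is about 56% the size of the one at 512.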
