Is there a pre-trained BERT model with a sequence length of 2048?

Hello,
I want to use the pre-trained BERT model because I do not want to train the entire BERT model to analyze my data. Is there a pre-trained BERT model with sequence length 2048?

Or do all pre-trained BERT models have a maximum sequence length of 512?

Thank you.

Hi, instead of BERT, you may be interested in Longformer, which has pretrained weights for a sequence length of 4096.
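In case it helps, here is a minimal sketch of loading Longformer for a long document. It assumes the Hugging Face `transformers` library with PyTorch and the public `allenai/longformer-base-4096` checkpoint; neither is named in this thread, and `encode_long_text` is just a hypothetical helper name.

```python
# Assumptions (not from the thread): Hugging Face `transformers` + PyTorch are
# installed, and the public checkpoint "allenai/longformer-base-4096" is used.
CHECKPOINT = "allenai/longformer-base-4096"
MAX_TOKENS = 4096  # sequence length this checkpoint was pretrained on


def encode_long_text(text: str):
    """Hypothetical helper: return Longformer's last hidden states for one document."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModel.from_pretrained(CHECKPOINT)
    model.eval()

    # Truncate anything beyond the pretrained maximum length.
    inputs = tokenizer(
        text, truncation=True, max_length=MAX_TOKENS, return_tensors="pt"
    )
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state  # shape: (1, num_tokens, hidden_size)
```

Calling `encode_long_text(my_document)` downloads the weights on first use, so you only need to fine-tune a task head on top rather than pretrain anything yourself.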


I’ve not seen a pre-trained BERT with a sequence length of 2048.
The cost of self-attention goes up as the square of the sequence length, so a model with length 2048 would be very costly to produce. Even if one exists, I think you would need a TPU to fine-tune it.
See this page: https://github.com/google-research/bert
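To put a rough number on the "square of the sequence length" point above, here is a small sketch. It counts only the entries of the attention score matrix (one per pair of tokens, per head and per layer) and ignores the parts of the model that scale linearly, so treat it as an illustration rather than a real cost estimate. The function names are made up for this example.

```python
def attention_score_entries(seq_len: int) -> int:
    """Entries in one attention score matrix: one per (query, key) token pair."""
    return seq_len * seq_len


def relative_attention_cost(seq_len: int, baseline: int = 512) -> float:
    """How many times larger the score matrix is than at the baseline length."""
    return attention_score_entries(seq_len) / attention_score_entries(baseline)


if __name__ == "__main__":
    for n in (512, 1024, 2048):
        print(f"seq_len={n}: {relative_attention_cost(n):.0f}x the attention cost of 512")
```

So going from 512 to 2048 tokens is 4 times the length but about 16 times the attention memory and compute per layer.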