Hi there!
It’s hard to know exactly what’s going on without seeing your code, but here is what I can share about RoBERTa. You should not use max_position_embeddings as the maximum sequence length. Because the position IDs of RoBERTa go from padding_index to maximum_sequence_length + padding_index, max_position_embeddings is purposely set to 514 (512, the maximum sequence length, plus 2 for the padding index). You should use tokenizer.model_max_length instead (which should be 512).
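
For instance, here is a minimal sketch (assuming the standard `roberta-base` checkpoint) showing where the two numbers come from and how to truncate with the tokenizer's limit rather than the config value:

```python
# Minimal sketch, assuming the "roberta-base" checkpoint.
from transformers import RobertaConfig, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
config = RobertaConfig.from_pretrained("roberta-base")

print(config.max_position_embeddings)  # 514 = 512 (max sequence length) + 2 (padding offset)
print(tokenizer.model_max_length)      # 512 -- this is the value to use for truncation

# Truncate inputs with the tokenizer's limit, not max_position_embeddings
enc = tokenizer(
    "Some long text ...",
    truncation=True,
    max_length=tokenizer.model_max_length,
    return_tensors="pt",
)
```

Feeding sequences truncated to 514 would shift the position IDs past the size of the position embedding matrix, which is typically what causes the indexing errors people run into here.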