PyTorch version

This is all done to mimic the original implementation of RoBERTa. So no, RoBERTa does not use sinusoidal position embeddings; it uses learned ones. That's also why we can't change the padding_index for the position_ids: it would break compatibility with the pretrained models.
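
For concreteness, here is a minimal sketch of how the position ids are derived from the input ids. It mirrors the logic of the `create_position_ids_from_input_ids` helper in `transformers`' `modeling_roberta`: padding positions stay frozen at `padding_idx`, and real tokens count up from `padding_idx + 1` (the exact tensor values below are illustrative, assuming RoBERTa's pad token id of 1):

```python
import torch

def create_position_ids(input_ids: torch.Tensor, padding_idx: int = 1) -> torch.Tensor:
    # Non-pad tokens get consecutive positions starting at padding_idx + 1;
    # pad tokens keep padding_idx itself, so embedding lookups stay aligned
    # with the pretrained weights.
    mask = input_ids.ne(padding_idx).int()
    positions = torch.cumsum(mask, dim=1) * mask
    return positions.long() + padding_idx

input_ids = torch.tensor([[0, 31414, 232, 2, 1, 1]])  # last two tokens are padding
print(create_position_ids(input_ids))  # tensor([[2, 3, 4, 5, 1, 1]])
```

This offset is also why `roberta-base`'s position-embedding table has 514 rows rather than 512: the first usable position is `padding_idx + 1 = 2`, so two extra rows are reserved at the front.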