Positional encoding error in RoBERTa

Hello all,

I’m using a RobertaForMaskedLM model initialized with the following configuration:

config = RobertaConfig(vocab_size=tokenizer.vocab_size, max_position_embeddings=128)

because I padded my input tokens to a size of 128.

However, during training, I get the following error:

IndexError: index out of range in self

which traces back to the position_ids passed to the position_embeddings layer.
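Here is a minimal sketch that reproduces it on my side (the roberta-base tokenizer and the random batch are just placeholders for my actual tokenizer and data):

import torch
from transformers import RobertaConfig, RobertaForMaskedLM, RobertaTokenizerFast

# placeholder tokenizer -- in my case it is a custom one; its pad token id is 1
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")

config = RobertaConfig(vocab_size=tokenizer.vocab_size, max_position_embeddings=128)
model = RobertaForMaskedLM(config)

# a batch of 128 non-pad tokens per sequence (ids >= 4 avoid the special tokens)
input_ids = torch.randint(low=4, high=tokenizer.vocab_size, size=(2, 128))
outputs = model(input_ids=input_ids)  # IndexError: index out of range in self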

Indeed, the positional embedding layer is initialized with config.max_position_embeddings (equal to 128 here):

self.position_embeddings = nn.Embedding(config.max_position_embeddings, ...)

whereas the function that creates position_ids assigns non-pad tokens positions from padding_idx + 1 up to padding_idx + sequence_length, so the largest index here is 1 + 128 = 129, while the embedding table only accepts indices up to 127.
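If it helps, the position id creation (paraphrased and simplified from create_position_ids_from_input_ids in modeling_roberta.py, so details may vary between versions) is roughly:

import torch

def create_position_ids_from_input_ids(input_ids, padding_idx):
    # non-pad tokens get positions padding_idx + 1, padding_idx + 2, ...
    # pad tokens keep the position padding_idx itself
    mask = input_ids.ne(padding_idx).int()
    incremental_indices = torch.cumsum(mask, dim=1) * mask
    return incremental_indices.long() + padding_idx

input_ids = torch.randint(low=4, high=50265, size=(1, 128))  # 128 non-pad tokens
position_ids = create_position_ids_from_input_ids(input_ids, padding_idx=1)
print(position_ids.max())  # tensor(129), but nn.Embedding(128, ...) only accepts indices 0..127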

Therefore, it seems that one needs to set max_position_embeddings to sequence_length + padding_idx + 1 (i.e. 130 here), which matches the 514 = 512 + 2 used by the pretrained roberta-base config, but I can't see any explanation of this in the documentation. Can anyone explain what I haven't understood?
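For clarity, the workaround I have in mind looks like this (just a sketch, assuming padding_idx = 1, the id of the standard RoBERTa pad token):

config = RobertaConfig(
    vocab_size=tokenizer.vocab_size,
    max_position_embeddings=128 + 2,  # sequence length + padding_idx + 1
)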