I’m using a RobertaForMaskedLM model initialized with the following configuration:
config = RobertaConfig(vocab_size = tokenizer.vocab_size, max_position_embeddings = 128)
because I padded my input tokens to a size of 128.
However, during training I get the following error:
IndexError: index out of range in self
which comes from the position_ids in the position_embedding layer.
Indeed, the position embedding layer is initialized with the argument max_position_embeddings (equal to 128 here):
self.position_embeddings = nn.Embedding(config.max_position_embeddings, ...)
whereas the function that creates position_ids assigns non-padding tokens the ids padding_idx + 1 through padding_idx + sequence_length. With RoBERTa's default padding_idx = 1 and a 128-token input, the largest id is 129, which is out of range for an embedding table of size 128 (valid indices 0–127).
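To illustrate, here is a pure-Python sketch of that logic (mirroring the behavior of transformers' create_position_ids_from_input_ids helper; padding_idx = 1 is an assumption matching RoBERTa's default pad_token_id):

```python
def create_position_ids(input_ids, padding_idx=1):
    # Non-padding tokens get positions padding_idx + 1, padding_idx + 2, ...
    # Padding tokens keep position padding_idx, as in RoBERTa.
    position_ids = []
    count = 0
    for tok in input_ids:
        if tok == padding_idx:
            position_ids.append(padding_idx)
        else:
            count += 1
            position_ids.append(padding_idx + count)
    return position_ids

ids = create_position_ids([5] * 128)  # 128 non-padding tokens (toy token ids)
print(max(ids))  # 129 -- but nn.Embedding(128, ...) only accepts indices 0-127
```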
Therefore, it seems that one needs to set max_position_embeddings = sequence_length + padding_idx + 1 (130 here) to avoid the error, but I can't find any explanation of this in the documentation. Can anyone explain what I haven't understood?
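For reference, the workaround I describe would look like this (a config sketch, assuming tokenizer is the tokenizer from above; note that the pretrained roberta-base checkpoint uses max_position_embeddings = 514 for 512-token inputs, apparently for the same reason):

```python
from transformers import RobertaConfig, RobertaForMaskedLM

config = RobertaConfig(
    vocab_size=tokenizer.vocab_size,
    # sequence length (128) + padding_idx (1) + 1 = 130
    max_position_embeddings=128 + 1 + 1,
)
model = RobertaForMaskedLM(config)
```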