PyTorch version

OK, so I tracked down the crash. The problem was the position embeddings: I had `max_seq_length == max_position_embeddings`, which produces a position index greater than `max_position_embeddings` for any sequence that is truncated (i.e. one that fills the full `max_seq_length` with no padding tokens).

This is because `create_position_ids_from_input_ids` in `modeling_roberta.py` (below) adds `padding_idx` to the cumsum, so if none of the `input_ids` are padding the largest position id becomes `max_seq_length + padding_idx`, which is greater than `max_seq_length`.

```python
def create_position_ids_from_input_ids(input_ids, padding_idx):
    # 1 for real tokens, 0 for padding
    mask = input_ids.ne(padding_idx).int()
    incremental_indices = torch.cumsum(mask, dim=1).type_as(mask) * mask
    # shifting by padding_idx makes the largest id seq_length + padding_idx
    return incremental_indices.long() + padding_idx
```
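
To see the overflow concretely, here is a minimal sketch of that same indexing logic (assuming RoBERTa's default `padding_idx = 1` and a toy `max_seq_length` of 8; the token values are made up):

```python
import torch

padding_idx = 1      # RoBERTa's default pad token id
max_seq_length = 8   # toy value; imagine max_position_embeddings is also 8

# A truncated sequence fills every slot, so there are no padding tokens.
input_ids = torch.tensor([[5, 6, 7, 8, 9, 10, 11, 12]])

mask = input_ids.ne(padding_idx).int()
incremental_indices = torch.cumsum(mask, dim=1).type_as(mask) * mask
position_ids = incremental_indices.long() + padding_idx

print(position_ids)               # tensor([[2, 3, 4, 5, 6, 7, 8, 9]])
print(position_ids.max().item())  # 9 == max_seq_length + padding_idx
```

With the position embedding table sized at `max_position_embeddings == max_seq_length` (8 here), an index of 9 is out of range for the embedding lookup, which is what triggers the crash.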