I’m currently working with BERT applied to time sequences. To test whether my inputs are in a format BERT can process, I’m running:
import torch
from transformers import BertConfig, BertForPreTraining

config = BertConfig(vocab_size=30003,
                    num_attention_heads=12,
                    num_hidden_layers=12)
model = BertForPreTraining(config)
outputs = model(torch.LongTensor(inputs["labels"][54581]).view(-1, 43))
I chose this vocab_size because my tokens range from 0 to 30000 and I added two special tokens, to which I assigned the ids 30002 and 30003, hence the size. I’m reshaping the input to (1, 43) since I’m just trying to predict a single sequence of length 43 (including the CLS and SEP tokens)…
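Since my special token ids sit right at the top of the vocab, I also ran a quick range check (the ids below are dummies standing in for my real 43-token sequence):

```python
# An embedding table of size vocab_size only accepts ids 0 .. vocab_size - 1.
# Dummy ids standing in for my real sequence, including both special tokens.
vocab_size = 30003
sequence = [30002, 12189, 16651, 30003, 16139]

out_of_range = [t for t in sequence if not 0 <= t < vocab_size]
print(out_of_range)  # any id printed here would fail the embedding lookup
```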
The input above is of the form:
tensor([30002, 12189, 12818, 13938, 15092, 15906, 16238, 16138, 15772, 15349,
15094, 15193, 15740, 16740, 18137, 19763, 21208, 21979, 21630, 19799,
16651, 30003, 14003, 13028, 12250, 11881, 12082, 12807, 13975, 15462,
17065, 18514, 19534, 19937, 19843, 19390, 18737, 18047, 17449, 16976,
16575, 16139, 30003])
which seems to be the format BERT recognizes. I also added a token -100 to represent [MASK], as suggested in the docs. Then the following error appears:
/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
2041 # remove once script supports set_grad_enabled
2042 _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 2043 return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
2044
2045
IndexError: index out of range in self
The full traceback is quite long, so I’ve only included the last frame. I don’t understand what I’m supposed to do now. Can anyone give me a clue about what I can try here? Thanks a lot!
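EDIT: for completeness, this is roughly how I injected the -100 mask token before converting to a LongTensor (dummy ids, and the mask position is chosen arbitrarily for illustration):

```python
# Dummy ids standing in for my real 43-token sequence
seq = [30002, 12189, 12818, 13938, 16651, 30003]

# Replace one position with -100 to mark it as [MASK], as I understood
# from the docs; the result is then wrapped with torch.LongTensor as above.
masked = list(seq)
masked[2] = -100
print(masked)
```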