Error when Fine-tuning pretrained Masked Language Model

Neel-Gupta · April 9, 2021, 11:54am

My full question is on Stack Overflow: "TypeError: zeros_like(): argument 'input' when fine-tuning on MLM".

Basically, I get this error when fine-tuning my pretrained model:

ValueError: expected sequence of length 2033 at dim 1 (got 2036)

Does anyone have any idea how I can solve this?

Neel-Gupta · April 9, 2021, 11:36pm

Anyone? I have set padding=True, so such issues should not occur.

Maimonator · April 11, 2021, 10:39am

tokenizer(batch_sentences, padding=True) pads to the longest sequence in the batch.
Maybe you wanted to use:
tokenizer(batch_sentences, padding='max_length'), which pads to the maximum model input length.

These descriptions are taken from the docs.
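
Separately from the docs excerpt, here is a toy sketch (plain Python, not the library's implementation) of why the two settings can behave differently once the dataset is tokenized in separate chunks or one example at a time. The helper pad_batch, the pad id 0, and the example ids are all made up for illustration:

```python
def pad_batch(batch, pad_id=0, max_length=None):
    """Pad (and truncate) every sequence in `batch` to one common length.

    max_length=None loosely mimics padding=True: pad to the longest
    sequence in *this* batch. An integer loosely mimics
    padding="max_length" combined with truncation=True.
    """
    target = max_length if max_length is not None else max(len(seq) for seq in batch)
    padded = []
    for seq in batch:
        seq = seq[:target]  # truncate if too long
        padded.append(seq + [pad_id] * (target - len(seq)))
    return padded


# Two chunks of a dataset that get tokenized independently of each other.
chunk_a = [[5, 6, 7], [8, 9]]
chunk_b = [[1, 2, 3, 4, 5], [6]]

# Per-batch padding: each chunk is consistent internally, but the chunks disagree.
longest_a = pad_batch(chunk_a)
longest_b = pad_batch(chunk_b)
lengths_longest = {len(seq) for seq in longest_a + longest_b}

# Fixed-length padding: every row ends up with the same length.
fixed_a = pad_batch(chunk_a, max_length=4)
fixed_b = pad_batch(chunk_b, max_length=4)
lengths_fixed = {len(seq) for seq in fixed_a + fixed_b}

print(sorted(lengths_longest))  # [3, 5]
print(sorted(lengths_fixed))    # [4]
```

If rows from the first case end up in the same training batch, they cannot be stacked into one rectangular tensor, which is consistent with an "expected sequence of length X at dim 1 (got Y)" error.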

Neel-Gupta · April 11, 2021, 11:59am

Thanks a lot for the reply, @Maimonator! I had missed the max_length argument.
I am now getting this new error:

ValueError: expected sequence of length 2000 at dim 1 (got 1981)

Simply put, my tokenization function just doesn’t work. Can you look at the code I posted at the Stack Overflow link for my tok function and how I use dataset.map to apply it to my dataset?

I personally can’t figure out why it doesn’t work

Maimonator · April 11, 2021, 1:25pm

Is this the latest tokenizer function?

def tok(example):
  encodings = tokenizer(example['src'], truncation=True, padding=True)
  return encodings

Try this instead:

def tok(example):
  encodings = tokenizer(example['src'], truncation=True, padding="max_length", max_length=2000)
  return encodings

Let me know if this works for you

Neel-Gupta · April 11, 2021, 2:27pm

I tried the different function, but got this:

---------------------------------------------------------------------------

IndexError                                Traceback (most recent call last)

<ipython-input-158-6068ea33d5d4> in <module>()
     45     )
     46 
---> 47 train_results = trainer.train()

11 frames

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   1914         # remove once script supports set_grad_enabled
   1915         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 1916     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   1917 
   1918 

IndexError: index out of range in self

~~Which means there is no truncation/padding being done most probably?~~

Hmm… I removed those arguments completely to see the new error message, which shows that it does indeed truncate and pad to the model’s max input length. So apparently this error is a separate, unrelated one.

Maimonator · April 11, 2021, 2:51pm

Maybe this call is also missing?
Here’s how I use it:
model.resize_token_embeddings(len(tokenizer))
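
One way to check whether resizing is relevant at all: an embedding lookup fails with "index out of range in self" when an index is at least as large as the number of rows in the table. The sketch below is plain Python with a hypothetical helper (ids_outside_vocab) and made-up token ids; in a real setup the row count would presumably come from something like model.get_input_embeddings().num_embeddings, which is an assumption about your model object, not something shown in this thread:

```python
def ids_outside_vocab(batch_input_ids, embedding_rows):
    """Return the sorted token ids that a table with `embedding_rows` rows cannot look up."""
    return sorted(
        {
            tok
            for seq in batch_input_ids
            for tok in seq
            if tok < 0 or tok >= embedding_rows
        }
    )


# Illustrative batch: the id 50265 would need a table with at least 50266 rows.
batch = [[0, 17, 42, 2], [0, 50265, 2]]

too_small = ids_outside_vocab(batch, embedding_rows=50265)
big_enough = ids_outside_vocab(batch, embedding_rows=50266)

print(too_small)   # [50265]
print(big_enough)  # []
```

If this kind of check comes back empty on the real data, the word embeddings are probably not the source of the IndexError, and resizing them would not be expected to help.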

Neel-Gupta · April 11, 2021, 2:59pm

I really appreciate your replies helping me figure out this weird problem, but this gets the same error as posted above: index out of range in self.

This seems to be a pretty pesky and weird issue. I wish there were more comprehensive examples on simple modelling with HF, rather than the SQuAD and official tasks explored in these examples.

Maimonator · April 11, 2021, 9:41pm

I hear you…
I feel the examples and documentation aren’t as elaborate as we would’ve wished.

Also I didn’t mention this explicitly, but I’ve set max_length=2000 in this tokenization function:

def tok(example):
  encodings = tokenizer(example['src'], truncation=True, padding="max_length", max_length=2000)
  return encodings

But you should set it to whatever you think is legit.
I don’t have any new ideas as I’m quite new to this library as well, but update us on the development!
Hopefully you’ll solve it soon

Neel-Gupta · April 13, 2021, 7:56pm

day 100 of reporting, still getting this error

---------------------------------------------------------------------------

IndexError                                Traceback (most recent call last)

<ipython-input-38-dda642f3d8b6> in <module>()
     47     )
     48 
---> 49 train_results = trainer.train()

11 frames

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, **kwargs)
   1118                         tr_loss += self.training_step(model, inputs)
   1119                 else:
-> 1120                     tr_loss += self.training_step(model, inputs)
   1121                 self._total_flos += float(self.floating_point_ops(inputs))
   1122 

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in training_step(self, model, inputs)
   1522                 loss = self.compute_loss(model, inputs)
   1523         else:
-> 1524             loss = self.compute_loss(model, inputs)
   1525 
   1526         if self.args.n_gpu > 1:

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in compute_loss(self, model, inputs, return_outputs)
   1554         else:
   1555             labels = None
-> 1556         outputs = model(**inputs)
   1557         # Save past state if it exists
   1558         # TODO: this needs to be fixed and made cleaner later.

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    887             result = self._slow_forward(*input, **kwargs)
    888         else:
--> 889             result = self.forward(*input, **kwargs)
    890         for hook in itertools.chain(
    891                 _global_forward_hooks.values(),

/usr/local/lib/python3.7/dist-packages/transformers/models/longformer/modeling_longformer.py in forward(self, input_ids, attention_mask, global_attention_mask, head_mask, token_type_ids, position_ids, inputs_embeds, labels, output_attentions, output_hidden_states, return_dict)
   1855             output_attentions=output_attentions,
   1856             output_hidden_states=output_hidden_states,
-> 1857             return_dict=return_dict,
   1858         )
   1859         sequence_output = outputs[0]

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    887             result = self._slow_forward(*input, **kwargs)
    888         else:
--> 889             result = self.forward(*input, **kwargs)
    890         for hook in itertools.chain(
    891                 _global_forward_hooks.values(),

/usr/local/lib/python3.7/dist-packages/transformers/models/longformer/modeling_longformer.py in forward(self, input_ids, attention_mask, global_attention_mask, head_mask, token_type_ids, position_ids, inputs_embeds, output_attentions, output_hidden_states, return_dict)
   1662 
   1663         embedding_output = self.embeddings(
-> 1664             input_ids=input_ids, position_ids=position_ids, token_type_ids=token_type_ids, inputs_embeds=inputs_embeds
   1665         )
   1666 

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    887             result = self._slow_forward(*input, **kwargs)
    888         else:
--> 889             result = self.forward(*input, **kwargs)
    890         for hook in itertools.chain(
    891                 _global_forward_hooks.values(),

/usr/local/lib/python3.7/dist-packages/transformers/models/longformer/modeling_longformer.py in forward(self, input_ids, token_type_ids, position_ids, inputs_embeds)
    491         if inputs_embeds is None:
    492             inputs_embeds = self.word_embeddings(input_ids)
--> 493         position_embeddings = self.position_embeddings(position_ids)
    494         token_type_embeddings = self.token_type_embeddings(token_type_ids)
    495 

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    887             result = self._slow_forward(*input, **kwargs)
    888         else:
--> 889             result = self.forward(*input, **kwargs)
    890         for hook in itertools.chain(
    891                 _global_forward_hooks.values(),

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/sparse.py in forward(self, input)
    156         return F.embedding(
    157             input, self.weight, self.padding_idx, self.max_norm,
--> 158             self.norm_type, self.scale_grad_by_freq, self.sparse)
    159 
    160     def extra_repr(self) -> str:

/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   1914         # remove once script supports set_grad_enabled
   1915         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 1916     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   1917 
   1918 

IndexError: index out of range in self

Still trying to get out of this…
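
A note on reading the traceback above: the failing lookup is self.position_embeddings(position_ids) in modeling_longformer.py, not the word embeddings, so the index that is out of range is a position, not a token id. A rough sketch of the arithmetic, assuming RoBERTa-style position ids that start at padding_idx + 1 (the helper names and the example sizes 514 and 4098 are illustrative; the actual max_position_embeddings of the model in this thread is not shown):

```python
def max_usable_length(max_position_embeddings, padding_idx=1):
    """Longest sequence a RoBERTa-style position table can index.

    Assumes real tokens get position ids padding_idx + 1, padding_idx + 2, ...
    so the first padding_idx + 1 rows are not available for them.
    """
    return max_position_embeddings - padding_idx - 1


def fits(seq_len, max_position_embeddings, padding_idx=1):
    """True if a sequence of `seq_len` tokens stays inside the position table."""
    return seq_len <= max_usable_length(max_position_embeddings, padding_idx)


small_limit = max_usable_length(514)   # a table with 514 rows
fits_small = fits(2000, 514)           # max_length=2000 against that table
fits_large = fits(2000, 4098)          # the same length against 4098 rows

print(small_limit)  # 512
print(fits_small)   # False
print(fits_large)   # True
```

Under that assumption, padding to max_length=2000 would only work if the model config's max_position_embeddings is at least 2002; comparing that config value with the max_length passed to the tokenizer seems like a reasonable next check.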

PriyankaDilip · October 29, 2021, 12:47am

This worked for me using the length in the error message I was getting, thanks Maimonator!

AlexKay · November 24, 2021, 3:37pm

seems like this one fixed my problem! thanks

pchhapolika · March 9, 2023, 8:40am

Hi @AlexKay How did you fix this issue?
