Error while using LiLT model "index out of range in self"

Hello everyone,

I am trying to fine-tune the "nielsr/lilt-xlm-roberta-base" model:

model = LiltForTokenClassification.from_pretrained(
  "nielsr/lilt-xlm-roberta-base",
  num_labels=len(labels),
  label2id=label2id,
  id2label=id2label,
)

When I execute:

training_args = TrainingArguments(
  output_dir=save_path,
  overwrite_output_dir=True,
  num_train_epochs=20,
  learning_rate=5e-5,
  evaluation_strategy="steps",
  eval_steps=100,
  # save_total_limit=1,
  load_best_model_at_end=True,
  metric_for_best_model="f1"
  )
# (the Trainer itself is built from model and training_args; its construction is omitted here)
trainer.train()

I obtain the following error. Can anyone provide an explanation or a solution?
Thank you in advance.

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   1660             self._inner_training_loop, self._train_batch_size, args.auto_find_batch_size
   1661         )
-> 1662         return inner_training_loop(
   1663             args=args,
   1664             resume_from_checkpoint=resume_from_checkpoint,

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in _inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
   1927                         tr_loss_step = self.training_step(model, inputs)
   1928                 else:
-> 1929                     tr_loss_step = self.training_step(model, inputs)
   1930 
   1931                 if (

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in training_step(self, model, inputs)
   2697 
   2698         with self.compute_loss_context_manager():
-> 2699             loss = self.compute_loss(model, inputs)
   2700 
   2701         if self.args.n_gpu > 1:

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in compute_loss(self, model, inputs, return_outputs)
   2729         else:
   2730             labels = None
-> 2731         outputs = model(**inputs)
   2732         # Save past state if it exists
   2733         # TODO: this needs to be fixed and made cleaner later.

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.10/dist-packages/transformers/models/lilt/modeling_lilt.py in forward(self, input_ids, bbox, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, labels, output_attentions, output_hidden_states, return_dict)
   1027         return_dict = return_dict if return_dict is not None else self.config.use_return_dict
   1028 
-> 1029         outputs = self.lilt(
   1030             input_ids,
   1031             bbox=bbox,

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.10/dist-packages/transformers/models/lilt/modeling_lilt.py in forward(self, input_ids, bbox, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, output_attentions, output_hidden_states, return_dict)
    815         )
    816 
--> 817         layout_embedding_output = self.layout_embeddings(bbox=bbox, position_ids=position_ids)
    818 
    819         encoder_outputs = self.encoder(

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.10/dist-packages/transformers/models/lilt/modeling_lilt.py in forward(self, bbox, position_ids)
    166 
    167         h_position_embeddings = self.h_position_embeddings(bbox[:, :, 3] - bbox[:, :, 1])
--> 168         w_position_embeddings = self.w_position_embeddings(bbox[:, :, 2] - bbox[:, :, 0])
    169 
    170         spatial_position_embeddings = torch.cat(

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/sparse.py in forward(self, input)
    160 
    161     def forward(self, input: Tensor) -> Tensor:
--> 162         return F.embedding(
    163             input, self.weight, self.padding_idx, self.max_norm,
    164             self.norm_type, self.scale_grad_by_freq, self.sparse)

/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   2208         # remove once script supports set_grad_enabled
   2209         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 2210     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   2211 
   2212 

IndexError: index out of range in self
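Looking at the bottom of the trace, the failure happens inside a plain nn.Embedding lookup on the bbox width/height differences. Here is a tiny repro of the same error (a sketch, assuming LiLT's default max_2d_position_embeddings of 1024):

import torch
import torch.nn as nn

# LiLT's layout embeddings index nn.Embedding tables with the bbox coordinates
# and with the differences bbox[:, :, 2] - bbox[:, :, 0] and bbox[:, :, 3] - bbox[:, :, 1].
# Any index below 0 or above table_size - 1 raises exactly this error.
table = nn.Embedding(1024, 64)

table(torch.tensor([0, 500, 1023]))  # fine
table(torch.tensor([-5]))            # IndexError: index out of range in self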

I have the same error. If you solve it, could you please share the solution with us?

Make sure to normalize your bounding boxes before passing them to the model.

LiLT requires the same normalization as LayoutLM (coordinates scaled to the 0-1000 range), which is explained in the LayoutLM documentation.
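For reference, the normalization described in the LayoutLM docs scales each pixel coordinate to the 0-1000 range using the size of the page image; a minimal version looks like this:

def normalize_bbox(bbox, width, height):
    # bbox = (x0, y0, x1, y1) in pixel coordinates; width and height are the
    # pixel dimensions of the page image the box comes from.
    return [
        int(1000 * bbox[0] / width),
        int(1000 * bbox[1] / height),
        int(1000 * bbox[2] / width),
        int(1000 * bbox[3] / height),
    ]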


I normalized them to the 0-1000 range, and I checked that bbox[:, :, 2] - bbox[:, :, 0] > 0 and bbox[:, :, 3] - bbox[:, :, 1] > 0.

It started training, but then the same error appeared.

I got the same issue while following the FUNSD tutorial notebook with my own dataset.

I just found that I have faulty bounding boxes in my data. Some input boxes (x0, y0, x1, y1) have x1 < x0 or y1 < y0. The width/height of these boxes would be negative, and that seems to cause the error; at least, the same was reported for a similar issue with LayoutLM.
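In case it helps anyone, here is a minimal sketch for flagging and repairing such boxes before training (it assumes boxes are already normalized to 0-1000; sanitize_bbox is just an illustrative helper, not part of the tutorial):

def sanitize_bbox(bbox):
    # Reorder the corners so that x1 >= x0 and y1 >= y0, then clamp to [0, 1000],
    # so the width/height embedding lookups never receive a negative index.
    x0, y0, x1, y1 = bbox
    x0, x1 = sorted((x0, x1))
    y0, y1 = sorted((y0, y1))
    clamp = lambda v: max(0, min(v, 1000))
    return [clamp(x0), clamp(y0), clamp(x1), clamp(y1)]

boxes = [[120, 80, 90, 200], [100, 300, 400, 350]]       # first box has x1 < x0
print([b for b in boxes if b[2] < b[0] or b[3] < b[1]])  # -> [[120, 80, 90, 200]]
print([sanitize_bbox(b) for b in boxes])                 # -> [[90, 80, 120, 200], [100, 300, 400, 350]]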

I will correct my dataset and post an update if it solves the issue for me!

Update: it solved the issue for me!