Error while using LiLT model "index out of range in self"

Hello everyone,

I am trying to fine-tune the "nielsr/lilt-xlm-roberta-base" model:

model = LiltForTokenClassification.from_pretrained(
  "nielsr/lilt-xlm-roberta-base",
  num_labels=len(labels),
  label2id=label2id,
  id2label=id2label,
)

When I execute:

training_args = TrainingArguments(
  output_dir=save_path,
  overwrite_output_dir=True,
  num_train_epochs=20,
  learning_rate=5e-5,
  evaluation_strategy="steps",
  eval_steps=100,
  # save_total_limit=1,
  load_best_model_at_end=True,
  metric_for_best_model="f1"
  )
# (the Trainer itself is built from model and training_args; its construction is omitted here)
trainer.train()

I obtain the following error. Can anyone provide an explanation or a solution?
Thank you in advance.

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   1660             self._inner_training_loop, self._train_batch_size, args.auto_find_batch_size
   1661         )
-> 1662         return inner_training_loop(
   1663             args=args,
   1664             resume_from_checkpoint=resume_from_checkpoint,

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in _inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
   1927                         tr_loss_step = self.training_step(model, inputs)
   1928                 else:
-> 1929                     tr_loss_step = self.training_step(model, inputs)
   1930 
   1931                 if (

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in training_step(self, model, inputs)
   2697 
   2698         with self.compute_loss_context_manager():
-> 2699             loss = self.compute_loss(model, inputs)
   2700 
   2701         if self.args.n_gpu > 1:

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in compute_loss(self, model, inputs, return_outputs)
   2729         else:
   2730             labels = None
-> 2731         outputs = model(**inputs)
   2732         # Save past state if it exists
   2733         # TODO: this needs to be fixed and made cleaner later.

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.10/dist-packages/transformers/models/lilt/modeling_lilt.py in forward(self, input_ids, bbox, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, labels, output_attentions, output_hidden_states, return_dict)
   1027         return_dict = return_dict if return_dict is not None else self.config.use_return_dict
   1028 
-> 1029         outputs = self.lilt(
   1030             input_ids,
   1031             bbox=bbox,

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.10/dist-packages/transformers/models/lilt/modeling_lilt.py in forward(self, input_ids, bbox, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, output_attentions, output_hidden_states, return_dict)
    815         )
    816 
--> 817         layout_embedding_output = self.layout_embeddings(bbox=bbox, position_ids=position_ids)
    818 
    819         encoder_outputs = self.encoder(

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.10/dist-packages/transformers/models/lilt/modeling_lilt.py in forward(self, bbox, position_ids)
    166 
    167         h_position_embeddings = self.h_position_embeddings(bbox[:, :, 3] - bbox[:, :, 1])
--> 168         w_position_embeddings = self.w_position_embeddings(bbox[:, :, 2] - bbox[:, :, 0])
    169 
    170         spatial_position_embeddings = torch.cat(

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in _call_impl(self, *args, **kwargs)
   1499                 or _global_backward_pre_hooks or _global_backward_hooks
   1500                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501             return forward_call(*args, **kwargs)
   1502         # Do not call functions when jit is used
   1503         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/sparse.py in forward(self, input)
    160 
    161     def forward(self, input: Tensor) -> Tensor:
--> 162         return F.embedding(
    163             input, self.weight, self.padding_idx, self.max_norm,
    164             self.norm_type, self.scale_grad_by_freq, self.sparse)

/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   2208         # remove once script supports set_grad_enabled
   2209         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 2210     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   2211 
   2212 

IndexError: index out of range in self
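Looking at the bottom of the trace, the failure happens inside a plain nn.Embedding lookup on the bbox width/height differences. Here is a tiny repro of the same error (a sketch, assuming LiLT's default max_2d_position_embeddings of 1024):

import torch
import torch.nn as nn

# LiLT's layout embeddings index nn.Embedding tables with the bbox coordinates
# and with the differences bbox[:, :, 2] - bbox[:, :, 0] and bbox[:, :, 3] - bbox[:, :, 1].
# Any index below 0 or above table_size - 1 raises exactly this error.
table = nn.Embedding(1024, 64)

table(torch.tensor([0, 500, 1023]))  # fine
table(torch.tensor([-5]))            # IndexError: index out of range in self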

I have the same error. If you solve it, could you please share the solution with us?

Make sure to normalize your bounding boxes before passing them to the model.

LiLT requires the same normalization as LayoutLM (coordinates scaled to the 0-1000 range), which is explained in the LayoutLM documentation.
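For reference, the normalization described in the LayoutLM docs scales each pixel coordinate to the 0-1000 range using the size of the page image; a minimal version looks like this:

def normalize_bbox(bbox, width, height):
    # bbox = (x0, y0, x1, y1) in pixel coordinates; width and height are the
    # pixel dimensions of the page image the box comes from.
    return [
        int(1000 * bbox[0] / width),
        int(1000 * bbox[1] / height),
        int(1000 * bbox[2] / width),
        int(1000 * bbox[3] / height),
    ]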


I normalized them to the 0-1000 range, and I checked that bbox[:, :, 2] - bbox[:, :, 0] > 0 and bbox[:, :, 3] - bbox[:, :, 1] > 0.

It started training, but then the same error appeared.

I got the same issue while following the FUNSD tutorial notebook with my own dataset.

I just found that I have faulty bounding boxes in my data. Some input boxes (x0, y0, x1, y1) have x1 < x0 or y1 < y0. The width/height of these boxes would be negative, and that seems to cause the error; at least, the same was reported for a similar issue with LayoutLM.
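In case it helps anyone, here is a minimal sketch for flagging and repairing such boxes before training (it assumes boxes are already normalized to 0-1000; sanitize_bbox is just an illustrative helper, not part of the tutorial):

def sanitize_bbox(bbox):
    # Reorder the corners so that x1 >= x0 and y1 >= y0, then clamp to [0, 1000],
    # so the width/height embedding lookups never receive a negative index.
    x0, y0, x1, y1 = bbox
    x0, x1 = sorted((x0, x1))
    y0, y1 = sorted((y0, y1))
    clamp = lambda v: max(0, min(v, 1000))
    return [clamp(x0), clamp(y0), clamp(x1), clamp(y1)]

boxes = [[120, 80, 90, 200], [100, 300, 400, 350]]       # first box has x1 < x0
print([b for b in boxes if b[2] < b[0] or b[3] < b[1]])  # -> [[120, 80, 90, 200]]
print([sanitize_bbox(b) for b in boxes])                 # -> [[90, 80, 120, 200], [100, 300, 400, 350]]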

I will correct my dataset and post an update if it solves the issue for me!

Update: it solved the issue for me!