Index out of range layoutlm

juvebogdan · December 4, 2020, 7:46am

I am trying to fine-tune LayoutLM for named entity extraction on SROIE receipts. I checked the LayoutLM GitHub page and used their run_seq_labeling.py and preprocess.py on the new dataset I prepared, but I am receiving the following error:

Iteration:   4%|█████▉                                                                                                                                                             | 21/577 [00:53<23:45,  2.56s/it]
Epoch:   0%|                                                                                                                                                                                | 0/100 [00:53<?, ?it/s]
Traceback (most recent call last):
  File "run_seq_labeling.py", line 812, in <module>
    main()
  File "run_seq_labeling.py", line 705, in main
    args, train_dataset, model, tokenizer, labels, pad_token_label_id
  File "run_seq_labeling.py", line 220, in train
    outputs = model(**inputs)
  File "/home/ml3/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ml3/.local/lib/python3.6/site-packages/layoutlm/modeling/layoutlm.py", line 221, in forward
    head_mask=head_mask,
  File "/home/ml3/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ml3/.local/lib/python3.6/site-packages/layoutlm/modeling/layoutlm.py", line 171, in forward
    input_ids, bbox, position_ids=position_ids, token_type_ids=token_type_ids
  File "/home/ml3/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ml3/.local/lib/python3.6/site-packages/layoutlm/modeling/layoutlm.py", line 82, in forward
    bbox[:, :, 2] - bbox[:, :, 0]
  File "/home/ml3/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ml3/.local/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 126, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/home/ml3/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 1814, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

I am using transformers 2.9, which the GitHub page lists as a requirement.

juvebogdan · December 4, 2020, 2:12pm

I found the issue. It turns out that the OCR detects vertical text, and in that case the box width comes out negative.

keshav5196 · March 9, 2021, 9:06am

Hi,

I am facing the same index out of range problem. Can you elaborate a little more on your solution?

Thank you.

juvebogdan · March 9, 2021, 9:38am

@keshav5196 Check your bounding boxes. In my case, I found out that some of the handwritten text was vertical. LayoutLM doesn’t like that. Just remove boxes like that.
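A minimal sketch of that kind of filtering, assuming each box is an (x0, y0, x1, y1) tuple aligned with a list of words. The function name drop_inverted_boxes and the sample data are hypothetical and are not part of the LayoutLM scripts:

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (x0, y0, x1, y1)


def drop_inverted_boxes(
    words: Sequence[str], boxes: Sequence[Box]
) -> Tuple[List[str], List[Box]]:
    """Keep only (word, box) pairs whose box has non-negative width and height."""
    kept_words: List[str] = []
    kept_boxes: List[Box] = []
    for word, (x0, y0, x1, y1) in zip(words, boxes):
        # A box with x1 < x0 or y1 < y0 yields a negative width or height.
        if x1 - x0 >= 0 and y1 - y0 >= 0:
            kept_words.append(word)
            kept_boxes.append((x0, y0, x1, y1))
    return kept_words, kept_boxes


# Hypothetical OCR output: the third box has x1 < x0.
words = ["TOTAL", "12.50", "rotated"]
boxes = [(100, 200, 180, 220), (400, 200, 460, 220), (900, 300, 880, 420)]

clean_words, clean_boxes = drop_inverted_boxes(words, boxes)
print(clean_words)  # ['TOTAL', '12.50']
```

If the dataset also carries one label per word, the labels would need to be filtered in the same pass so the three lists stay aligned.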

keshav5196 · March 9, 2021, 10:55am

First, to be clear: by vertical, do you mean rotated 90 degrees? If a word is vertical, how would the bounding box be affected?

By the way, I checked my data and didn’t find any vertical words in the images.

keshav5196 · March 10, 2021, 9:24am

Solved my problem. It was due to some boxes having a negative width or height.

For example, if the input box is (x0, y0, x1, y1) and either y1-y0 or x1-x0 is negative, LayoutLM will throw an error.

On GPU the error will be: CUDA error: device-side assert triggered.
On CPU the error will be: IndexError: index out of range in self
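One way to locate the offending boxes before training is to scan the dataset on the CPU. The sketch below is only an illustration: find_bad_boxes is a hypothetical helper, and the limit of 1024 is an assumption based on the default size of LayoutLM's 2D position embedding tables, so adjust it if your config differs.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (x0, y0, x1, y1)

# Assumption: default number of rows in the 2D position embedding tables.
MAX_2D_POSITION = 1024


def find_bad_boxes(
    boxes: Sequence[Box], limit: int = MAX_2D_POSITION
) -> List[Tuple[int, str]]:
    """Return (index, reason) for every box that would be an invalid embedding index."""
    problems: List[Tuple[int, str]] = []
    for i, (x0, y0, x1, y1) in enumerate(boxes):
        if not all(0 <= v < limit for v in (x0, y0, x1, y1)):
            problems.append((i, "coordinate outside [0, limit)"))
        elif x1 - x0 < 0:
            problems.append((i, "negative width"))
        elif y1 - y0 < 0:
            problems.append((i, "negative height"))
    return problems


# Hypothetical boxes: one valid, three broken in different ways.
sample = [
    (10, 10, 50, 30),   # fine
    (50, 10, 10, 30),   # x1 < x0
    (10, 30, 50, 10),   # y1 < y0
    (0, 0, 1200, 20),   # x1 beyond the assumed limit
]

for index, reason in find_bad_boxes(sample):
    print(index, reason)
```

Running a check like this over every example tends to be easier to debug than the device-side assert, because it points at a specific box index.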
