I am trying to train the LayoutLM model on a custom receipt dataset, but training fails with this error:
/opt/conda/conda-bld/pytorch_1603729138878/work/aten/src/ATen/native/cuda/Indexing.cu:658: indexSelectLargeIndex: block: [160,0,0], thread: [125,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1603729138878/work/aten/src/ATen/native/cuda/Indexing.cu:658: indexSelectLargeIndex: block: [160,0,0], thread: [126,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1603729138878/work/aten/src/ATen/native/cuda/Indexing.cu:658: indexSelectLargeIndex: block: [160,0,0], thread: [127,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Iteration: 2%|▋ | 3/140 [00:01<01:25, 1.61it/s]
Epoch: 0%| | 0/1 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "run_seq_labeling.py", line 832, in <module>
    main()
  File "run_seq_labeling.py", line 725, in main
    args, train_dataset, model, tokenizer, labels, pad_token_label_id
  File "run_seq_labeling.py", line 240, in train
    outputs = model(**inputs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/layoutlm/modeling/layoutlm.py", line 224, in forward
    head_mask=head_mask,
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/layoutlm/modeling/layoutlm.py", line 178, in forward
    input_ids, bbox, position_ids=position_ids, token_type_ids=token_type_ids
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/layoutlm/modeling/layoutlm.py", line 102, in forward
    + token_type_embeddings
RuntimeError: CUDA error: device-side assert triggered
This code was run in a Kaggle environment with a P100 GPU enabled. What is causing the error? The same code works on the SROIE dataset. However, SROIE has only 4 labels, whereas we increased this to 8 for the custom dataset.
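The `srcIndex < srcSelectDimSize` assertion comes from an index lookup, and the traceback ends inside the embedding code of `layoutlm.py`, so one thing that may be worth checking is whether any value in `input_ids`, `bbox`, or `token_type_ids` exceeds the size of its embedding table. Below is a minimal sketch of such a check in plain Python. The helper names (`flatten`, `out_of_range`, `check_batch`) and the sample batch are hypothetical, and the limits in `ASSUMED_LIMITS` are assumptions based on typical LayoutLM base settings; the real values should be read from the model's config.

```python
def flatten(values):
    """Yield every integer from an arbitrarily nested list or tuple."""
    for v in values:
        if isinstance(v, (list, tuple)):
            yield from flatten(v)
        else:
            yield int(v)


def out_of_range(values, size):
    """Return the sorted distinct indices that fall outside [0, size)."""
    return sorted({v for v in flatten(values) if v < 0 or v >= size})


# ASSUMED embedding-table sizes (hypothetical defaults, not taken from the
# question); replace them with the values from your own model config.
ASSUMED_LIMITS = {
    "input_ids": 30522,    # assumed vocabulary size
    "bbox": 1024,          # assumed number of 2D position embeddings
    "token_type_ids": 2,   # assumed number of token types
}


def check_batch(batch, limits=ASSUMED_LIMITS):
    """Map each input name to its offending values; empty dict if all are fine."""
    problems = {}
    for name, size in limits.items():
        if name in batch:
            bad = out_of_range(batch[name], size)
            if bad:
                problems[name] = bad
    return problems


# Made-up batch for illustration: one box coordinate (1290) is not
# normalized and exceeds the assumed limit of 1024.
sample_batch = {
    "input_ids": [[101, 2023, 102]],
    "bbox": [[[0, 0, 0, 0], [120, 40, 1290, 65], [1000, 1000, 1000, 1000]]],
    "token_type_ids": [[0, 0, 0]],
}
print(check_batch(sample_batch))
```

With real tensors, each entry could be converted first with `tensor.tolist()`. Running a single batch on the CPU, or setting the environment variable `CUDA_LAUNCH_BLOCKING=1`, usually produces a more precise error location than the asynchronous device-side assert shown above.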