RuntimeError: CUDA error: device-side assert triggered in training LayoutLM

I am trying to train the LayoutLM model on a custom receipt dataset, but training fails with this error:

/opt/conda/conda-bld/pytorch_1603729138878/work/aten/src/ATen/native/cuda/Indexing.cu:658: indexSelectLargeIndex: block: [160,0,0], thread: [125,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1603729138878/work/aten/src/ATen/native/cuda/Indexing.cu:658: indexSelectLargeIndex: block: [160,0,0], thread: [126,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1603729138878/work/aten/src/ATen/native/cuda/Indexing.cu:658: indexSelectLargeIndex: block: [160,0,0], thread: [127,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Iteration:   2%|▋                               | 3/140 [00:01<01:25,  1.61it/s]
Epoch:   0%|                                              | 0/1 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "run_seq_labeling.py", line 832, in <module>
    main()
  File "run_seq_labeling.py", line 725, in main
    args, train_dataset, model, tokenizer, labels, pad_token_label_id
  File "run_seq_labeling.py", line 240, in train
    outputs = model(**inputs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/layoutlm/modeling/layoutlm.py", line 224, in forward
    head_mask=head_mask,
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/layoutlm/modeling/layoutlm.py", line 178, in forward
    input_ids, bbox, position_ids=position_ids, token_type_ids=token_type_ids
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/layoutlm/modeling/layoutlm.py", line 102, in forward
    + token_type_embeddings
RuntimeError: CUDA error: device-side assert triggered

This code was run in a Kaggle environment with a P100 GPU enabled. What is causing the error? The same code works on the SROIE dataset, but SROIE has only 4 labels, while here we increased the number of labels to 8.
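
For anyone trying to narrow this down: the assertion `srcIndex < srcSelectDimSize` is raised by an index lookup, and the traceback ends inside the embedding sum, so one plausible cause is an index (token id, bounding-box coordinate, or token type id) that is larger than its embedding table. Below is a minimal sketch of a range check that can be run on a batch before it reaches the GPU. The helper `find_out_of_range`, the sample batch, and the table sizes are all hypothetical illustrations, not values from my run; the real sizes should be read from the model config.

```python
from typing import Dict, Iterable, List, Tuple


def find_out_of_range(
    indices: Dict[str, Iterable[int]],
    table_sizes: Dict[str, int],
) -> List[Tuple[str, int, int]]:
    """Return (name, bad_value, table_size) for each index outside [0, table_size)."""
    problems = []
    for name, values in indices.items():
        size = table_sizes[name]
        for value in values:
            if value < 0 or value >= size:
                problems.append((name, value, size))
    return problems


# Hypothetical flattened batch. With PyTorch tensors, something like
# tensor.flatten().tolist() would produce lists of this shape.
batch = {
    "input_ids": [101, 2054, 102],
    "bbox": [0, 0, 1000, 1000, 57, 20, 1250, 48],  # 1250 is deliberately too large
    "token_type_ids": [0, 0, 0],
}

# Assumed embedding table sizes, for illustration only.
sizes = {"input_ids": 30522, "bbox": 1024, "token_type_ids": 2}

for name, value, size in find_out_of_range(batch, sizes):
    print(f"{name}: index {value} is outside [0, {size})")
```

If a check like this reports nothing, the label ids are the next thing to compare against the number of labels the model was configured with. Running a single batch on CPU usually also gives a more readable error than the device-side assert.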
