What could be causing " line 51, in write_predictions_to_file if not preds_list[example_id]: IndexError: list index out of range" in token-classification?

I am running token classification on my data, and it runs just fine except for write_predictions_to_file in https://github.com/huggingface/transformers/blob/master/examples/token-classification/tasks.py#L51

I believe it's an issue with my data, because I can run the script over other data sources without any problems.

It seems that it runs the predictions just fine, and the issue is with writing the predictions to file; there seems to be some sort of index mismatch. The full error is

Traceback (most recent call last):
  File "run_ner.py", line 308, in <module>
    main()
  File "run_ner.py", line 297, in main
    token_classification_task.write_predictions_to_file(writer, f, preds_list)
  File "/content/transformers/examples/token-classification/tasks.py", line 51, in write_predictions_to_file
    if not preds_list[example_id]:
IndexError: list index out of range

Specifically, the mismatch seems to be between preds_list and test_input_reader.

I’ve been comparing the data that causes the error with data that runs just fine, but I can’t pick out any difference. I thought it might be caused by several blank lines in a row, or some other line-formatting issue, but I haven’t found anything like that yet.
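A rough way to test the blank-line idea is to count the sentences in the test file and compare that count with len(preds_list). The sketch below is only a diagnostic under assumptions: it assumes a CoNLL-style file (one token per line, blank lines or -DOCSTART- lines between sentences), and count_examples is a hypothetical helper written for this post, not part of transformers.

```python
from typing import Iterable, List, Tuple


def count_examples(lines: Iterable[str]) -> Tuple[int, List[int]]:
    """Count sentences in CoNLL-style lines and report suspicious blank lines.

    Hypothetical helper (not part of transformers). Returns the number of
    sentences and the 1-based line numbers of blank lines that neither close
    a sentence nor directly follow a -DOCSTART- line.
    """
    n_examples = 0
    in_sentence = False
    after_docstart = False
    extra_blank: List[int] = []

    for lineno, line in enumerate(lines, start=1):
        is_docstart = line.startswith("-DOCSTART-")
        is_blank = line.strip() == ""
        if is_docstart or is_blank:
            if in_sentence:
                # This separator closes the sentence that was being read.
                n_examples += 1
                in_sentence = False
            elif is_blank and not after_docstart:
                # A blank line with no sentence to close: repeated or stray.
                extra_blank.append(lineno)
            after_docstart = is_docstart
        else:
            in_sentence = True
            after_docstart = False

    if in_sentence:
        # The file ended without a trailing blank line.
        n_examples += 1
    return n_examples, extra_blank


if __name__ == "__main__":
    sample = [
        "EU B-ORG\n",
        "rejects O\n",
        "\n",
        "\n",
        "German B-MISC\n",
        "\n",
        "\n",
    ]
    print(count_examples(sample))
```

On the real data this would be called as count_examples(open("test.txt")), where the file name is an assumption. If the count differs from len(preds_list), or blank lines are reported (particularly at the very end of the file), that would be the first place to look.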

Does anyone have an idea?

For convenience, I recreated the issue in this colab notebook.

Hi @reSearch2vec, I will have a look at it and report back here :slight_smile:


Thanks, very much appreciated!