I am running token classification on my data, and it runs just fine except for
write_predictions_to_file in https://github.com/huggingface/transformers/blob/master/examples/token-classification/tasks.py#L51
I believe it's an issue with my data, because the same script runs over other data sources without any problems.
It seems that it runs the predictions just fine, and the issue is with writing the predictions to file; there seems to be some sort of index mismatch. The full error is:

```
Traceback (most recent call last):
  File "run_ner.py", line 308, in <module>
    main()
  File "run_ner.py", line 297, in main
    token_classification_task.write_predictions_to_file(writer, f, preds_list)
  File "/content/transformers/examples/token-classification/tasks.py", line 51, in write_predictions_to_file
    if not preds_list[example_id]:
IndexError: list index out of range
```
Specifically, the mismatch seems to be between `preds_list` and `test_input_reader`.
I've been comparing the data that causes the error with data that runs fine, but I can't pick anything out. I was thinking it might be caused by several newlines in a row, or some other line-formatting issue, but I haven't found it yet.
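One way I've tried to narrow it down: count the blank-line-separated sentences in the test file myself and compare that to `len(preds_list)` — if they differ, the writer's `example_id` will run past the end of the list. This is a hypothetical diagnostic sketch, not part of the example scripts; `count_sentences` is my own helper, and it assumes a CoNLL-style file where sentences are separated by blank lines:

```python
# Hypothetical diagnostic (not part of transformers): count sentences in a
# CoNLL-style token file, where sentences are separated by blank lines.
# If this count differs from len(preds_list), the IndexError in
# write_predictions_to_file would be expected.

def count_sentences(lines):
    """Count blank-line-separated sentences; runs of blank lines count once."""
    count = 0
    in_sentence = False
    for line in lines:
        if line.strip():
            in_sentence = True
        elif in_sentence:
            # A blank line ends the current sentence; extra blanks are ignored.
            count += 1
            in_sentence = False
    if in_sentence:  # file may not end with a blank line
        count += 1
    return count

# Example: two sentences, with a doubled blank line between them.
lines = ["EU B-ORG", "rejects O", "", "", "Peter B-PER"]
print(count_sentences(lines))  # -> 2
```

If the file reader and the prediction code disagree on how to treat consecutive blank lines, that alone could produce a length mismatch like this.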
Anyone have an idea?
For convenience, I recreated the issue in this colab notebook.