I want to train a TF transformer model for NER with my own pipeline, but I have a problem aligning the labels with the tokenized input. As I understand it, DataCollatorForTokenClassification is used for this task, but I can't figure out how to use it outside of Trainer to get aligned labels.
Just to clarify what I mean:
tokens: ['Europe', 'is', 'international']
labels: ['1', '0', '0']
input_ids: ['545', '43', '6343', '2334', '2']
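To make the mismatch concrete: there are 3 word-level labels but 5 input_ids, so each label has to be mapped onto the subword tokens. As far as I can tell, DataCollatorForTokenClassification only pads an existing `labels` field (with -100 by default) and does not do this mapping itself, so the alignment would have to happen at tokenization time. Below is a minimal sketch of that step in plain Python. The `word_ids` list is an assumption: it is what I would expect a fast tokenizer's `encoding.word_ids()` to return if 'international' were split into two pieces and followed by one special token. The helper name `align_labels` and the `label_all_subwords` flag are hypothetical, not part of the library.

```python
def align_labels(word_ids, word_labels, ignore_index=-100, label_all_subwords=False):
    """Expand word-level labels to token-level labels.

    word_ids: one entry per token; the index of the source word,
              or None for special tokens.
    word_labels: one integer label per word.
    """
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:
            # Special tokens are ignored by the loss.
            aligned.append(ignore_index)
        elif wid != previous:
            # First subword of a word gets the word's label.
            aligned.append(word_labels[wid])
        else:
            # Later subwords: either repeat the label or ignore them.
            aligned.append(word_labels[wid] if label_all_subwords else ignore_index)
        previous = wid
    return aligned


# Assumed mapping for the 5 input_ids above (hypothetical, for illustration only).
word_ids = [0, 1, 2, 2, None]
labels = [1, 0, 0]

print(align_labels(word_ids, labels))  # [1, 0, 0, -100, -100]
```

With a real fast tokenizer, `word_ids` would come from `tokenizer(tokens, is_split_into_words=True).word_ids()`, and the aligned list would be stored as the `labels` field before any padding or batching.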