Token Classification Label order

lfcc · November 11, 2022, 1:16am

Hello,
I have fine-tuned several BERT models to perform NER tasks.
To do so, I used a dataset annotated in IOB format.

Before using Hugging Face with BERT, I was performing the NER task with a BiLSTM-CRF model implemented in TensorFlow.

With my BiLSTM model, I had a recurring problem with the order of the labels:

sequence = ["Bruno", "Garcia", "is", "a", "great", "person"]
prediction = ["I-Person", "B-Person", "O", "O", "O", "O"]

As you can see, the labels are wrong and “invalid”, since an I-Person tag should only appear directly after a B-Person tag (or after another I-Person tag of the same entity).
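To make the rule concrete, here is a minimal sketch of a validity check. It assumes the strict IOB2 convention, where every entity must open with a B- tag; the function name `invalid_iob_positions` is hypothetical and not part of any library.

```python
def invalid_iob_positions(labels):
    """Return the indices of I- tags that do not continue an entity of the same type."""
    bad = []
    prev = "O"  # treat the start of the sequence as being outside any entity
    for i, label in enumerate(labels):
        if label.startswith("I-"):
            entity = label[2:]
            # Under the assumed IOB2 rule, I-X is valid only right after B-X or I-X.
            if prev not in (f"B-{entity}", f"I-{entity}"):
                bad.append(i)
        prev = label
    return bad


prediction = ["I-Person", "B-Person", "O", "O", "O", "O"]
print(invalid_iob_positions(prediction))  # prints [0]
```

Running this kind of check over a model's predictions is one way to count how often invalid sequences actually occur.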

Using AutoModelForTokenClassification, the fine-tuned model never made that mistake.

So my question is: how does AutoModelForTokenClassification avoid this problem?
