Calculate F1 score in a NER task with BERT

Sergio · January 15, 2021, 2:06pm

Hi everyone,
I fine-tuned a BERT model to perform a NER task using the BILUO scheme, and I need to calculate the F1 score.
However, in named-entity recognition the F1 score is calculated per entity, not per token.
Moreover, there is the WordPiece “problem” and the BILUO format, so I should:

aggregate the subwords into words
remove the prefixes “B-”, “I-”, “L-”, “U-” from each entity
calculate the F1 score at the entity level

Before I spend hours (if not days) trying to implement this, I would like to know whether a solution already exists.
Thanks in advance

sgugger · January 15, 2021, 2:21pm

You should use the seqeval metric from datasets, which will do all of this for you. Check the new run_ner script for an example.

Sergio · January 15, 2021, 3:51pm

Thanks for the hint @sgugger .
I have a question.
Is the seqeval metric in datasets the same implementation as this?
If so, that is a problem in my case, because with the BILOU format the F1 score can be calculated only in strict mode, while I need the default one.

sgugger · January 15, 2021, 5:11pm

Ah yes, it’s the same, so it won’t be useful for your use case, sorry.
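
For anyone landing here with the same requirement, the three steps listed in the question can be sketched in plain Python without seqeval. This is only a minimal sketch under stated assumptions, not the seqeval implementation and not necessarily equivalent to either of its modes. The function names (`words_from_wordpieces`, `biluo_spans`, `entity_f1`) are hypothetical. `word_ids` is assumed to be a per-token list of word indices with `None` for special tokens (the shape a fast tokenizer's `word_ids()` returns). Each word is assumed to take the tag of its first word-piece, and ill-formed tag sequences are simply discarded. The score is a micro-averaged F1 over exactly matching typed spans.

```python
from typing import List, Optional, Set, Tuple

# (entity type, index of first word, index of last word)
Span = Tuple[str, int, int]


def words_from_wordpieces(piece_tags: List[str],
                          word_ids: List[Optional[int]]) -> List[str]:
    """Step 1: keep the tag of the first word-piece of every word.

    Assumption: special tokens carry a word id of None and are skipped.
    """
    word_tags: List[str] = []
    previous = None
    for tag, wid in zip(piece_tags, word_ids):
        if wid is not None and wid != previous:
            word_tags.append(tag)
        previous = wid
    return word_tags


def biluo_spans(tags: List[str]) -> Set[Span]:
    """Step 2: turn word-level BILUO tags into typed entity spans.

    The prefix is stripped here; only the entity type and the span
    boundaries are kept. Ill-formed sequences are dropped (an assumption
    of this sketch, other choices are possible).
    """
    spans: Set[Span] = set()
    start, current = None, None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "U":
            spans.add((label, i, i))
            start = None
        elif prefix == "B":
            start, current = i, label
        elif prefix == "I" and start is not None and label == current:
            continue
        elif prefix == "L" and start is not None and label == current:
            spans.add((label, start, i))
            start = None
        else:  # "O" or an ill-formed continuation: close any open span
            start = None
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Step 3: micro-averaged F1 over exactly matching entity spans."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = biluo_spans(g), biluo_spans(p)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example: the prediction finds the person but misses the location.
gold = [["B-PER", "L-PER", "O", "U-LOC"]]
pred = [["B-PER", "L-PER", "O", "O"]]
score = entity_f1(gold, pred)  # precision 1.0, recall 0.5
```

Taking the first word-piece's tag is just one aggregation rule; a majority vote over the pieces would be another reasonable choice, and the rest of the sketch would not change.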
