NER tag, aggregation strategy

prashanth · January 28, 2022, 7:38pm

I’m trying to do NER tagging, and I have been using the pipeline to get predictions from my models.

Issue: aggregation_strategy="simple" does a good job, but the tags are grouped. How can I avoid the tags being grouped? I want tags like I-PER, not PER. On the other hand, I tried aggregation_strategy="none"; there the tags are generated the way I want, but words are split due to tokenization.

@sgugger can you please help me out with this?

prashanth · January 28, 2022, 8:57pm

I have tried the "first" and "max" options as well, but to no avail.

StivenLancheros · February 1, 2022, 6:38pm

Maybe I am not understanding your question correctly, but aggregation_strategy is used to group the entities in the predictions. So removing aggregation_strategy should give you the BIO(ES) tags individually.

Take a look at the code:

Pipeline Token Classification

use aggregation_strategy instead. Whether or not to group the tokens corresponding to the same entity together in the predictions or not.
aggregation_strategy (str, optional, defaults to "none"):
The strategy to fuse (or not) tokens based on the model prediction.

If you meant something different please let me know.
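If the goal is whole words that keep their B-/I- prefixes, one possible workaround is to run the pipeline with aggregation_strategy="none" and merge the sub-word pieces yourself. Below is a minimal sketch, not the library's own method. It assumes a WordPiece tokenizer (continuation pieces start with "##") and that each prediction is a dict with the entity, index, word, start and end keys. merge_wordpieces is a hypothetical helper name, and the sample predictions are hand-written for illustration, not real model output. Each merged word simply takes the tag of its first piece.

```python
# Sketch: rebuild whole words from aggregation_strategy="none" output
# while keeping the un-grouped tag (e.g. B-PER / I-PER) of the first piece.
# Real predictions would come from something like:
#   ner = pipeline("token-classification", model=..., aggregation_strategy="none")
#   tokens = ner(text)

def merge_wordpieces(text, tokens):
    """Merge '##' continuation pieces into words; keep the first piece's tag."""
    words = []
    for tok in tokens:
        is_continuation = bool(
            words
            and tok["word"].startswith("##")            # WordPiece assumption
            and tok["index"] == words[-1]["last_index"] + 1
        )
        if is_continuation:
            # Extend the current word to cover this piece.
            words[-1]["end"] = tok["end"]
            words[-1]["last_index"] = tok["index"]
        else:
            # Start a new word; its tag is the tag of this first piece.
            words.append({
                "entity": tok["entity"],
                "start": tok["start"],
                "end": tok["end"],
                "last_index": tok["index"],
            })
    # Recover the surface form from the character offsets.
    return [
        {"word": text[w["start"]:w["end"]], "entity": w["entity"],
         "start": w["start"], "end": w["end"]}
        for w in words
    ]


# Hand-written, hypothetical predictions for illustration only.
text = "Prashanth lives in Bengaluru"
tokens = [
    {"entity": "B-PER", "index": 1, "word": "Pra", "start": 0, "end": 3},
    {"entity": "I-PER", "index": 2, "word": "##shan", "start": 3, "end": 7},
    {"entity": "I-PER", "index": 3, "word": "##th", "start": 7, "end": 9},
    {"entity": "B-LOC", "index": 6, "word": "Ben", "start": 19, "end": 22},
    {"entity": "I-LOC", "index": 7, "word": "##gal", "start": 22, "end": 25},
    {"entity": "I-LOC", "index": 8, "word": "##uru", "start": 25, "end": 28},
]

merged = merge_wordpieces(text, tokens)
for item in merged:
    print(item["word"], item["entity"])
```

Taking the first piece's tag is only one choice; you could also vote across pieces or pick the highest-scoring one. Also note that tokens predicted as O are normally left out of the pipeline output by default, so a word whose first piece was tagged O would start from its first remaining piece here.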
