I'm trying to do NER tagging and have been using the pipeline to get predictions from my models.
The issue: aggregation_strategy="simple" does a good job, but the tags are grouped. How can I avoid the tags being grouped? I want tags like I-PER, not PER. On the other hand, with aggregation_strategy="none" the tags come out the way I want, but the words are split into sub-word pieces by the tokenizer.
@sgugger can you please help me out with this?
I have tried the "first" and "max" options as well, but they did not help.
Maybe I am not understanding your question correctly, but aggregation_strategy is used to group the entities in the predictions. So removing aggregation_strategy should give you the individual BIO(ES) tags.
Take a look at the code:
Pipeline Token Classification
... aggregation_strategy instead. Whether or not to group the tokens corresponding to the same entity together in the predictions.
aggregation_strategy (str, optional, defaults to ...): The strategy to fuse (or not) tokens based on the model prediction.
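If the goal is whole words that still carry their B-/I- prefixes, one possible workaround is to run the pipeline with aggregation_strategy="none" and glue the sub-word pieces back together yourself. The sketch below is only an illustration, not something the library provides: merge_subwords is a hypothetical helper, the sample tokens are hand-written to mimic the shape of the pipeline output (entity, score, index, word, start, end), and it assumes a fast tokenizer so that start and end are filled in. A piece is treated as a continuation when it directly follows the previous piece with no gap in the text, and each word keeps the tag of its first piece.

```python
from typing import Dict, List


def merge_subwords(text: str, tokens: List[Dict]) -> List[Dict]:
    """Merge sub-word pieces into whole words, keeping each word's B-/I- tag.

    `tokens` is assumed to look like the output of a token-classification
    pipeline run with aggregation_strategy="none".
    """
    words: List[Dict] = []
    for tok in sorted(tokens, key=lambda t: t["index"]):
        prev = words[-1] if words else None
        # Heuristic (an assumption): a piece continues the previous word when
        # it is the next token and starts exactly where the previous one ended.
        if (
            prev is not None
            and tok["index"] == prev["last_index"] + 1
            and tok["start"] == prev["end"]
        ):
            prev["end"] = tok["end"]
            prev["last_index"] = tok["index"]
            prev["word"] = text[prev["start"]:prev["end"]]
            continue
        # New word: keep the tag and score of its first piece.
        words.append({
            "entity": tok["entity"],
            "score": tok["score"],
            "word": text[tok["start"]:tok["end"]],
            "start": tok["start"],
            "end": tok["end"],
            "last_index": tok["index"],
        })
    for w in words:
        del w["last_index"]  # bookkeeping only, not part of the result
    return words


# Hand-written sample in the shape of the pipeline output (not real model output).
text = "Sylvain lives in Brooklyn"
sample = [
    {"entity": "B-PER", "score": 0.99, "index": 1, "word": "Sy", "start": 0, "end": 2},
    {"entity": "I-PER", "score": 0.98, "index": 2, "word": "##lva", "start": 2, "end": 5},
    {"entity": "I-PER", "score": 0.97, "index": 3, "word": "##in", "start": 5, "end": 7},
    {"entity": "B-LOC", "score": 0.99, "index": 6, "word": "Brooklyn", "start": 17, "end": 25},
]
print([(w["word"], w["entity"]) for w in merge_subwords(text, sample)])
# -> [('Sylvain', 'B-PER'), ('Brooklyn', 'B-LOC')]
```

With a real model you would pass the list returned by pipeline("token-classification", model=..., aggregation_strategy="none") for the same text instead of the sample. Note that the adjacency rule also joins pieces across hyphens or attached punctuation if those were tagged as entities, so it may need adjusting for your data.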
If you meant something different, please let me know.