I am currently modifying my approach to an NER project (i.e., re-annotating a bunch of data), as I believe having the model identify more entity types would reduce the amount of post-processing my application needs to do.
Right now it identifies drug and dosage, but with dosages like "1 tablet 2.5mg" it conflates the two parts: one is a dose amount and the other is a strength. So I am re-annotating to have the two separated.
However, I am a bit cautious about increasing the number of entity types, as I am worried that it may decrease model performance. I couldn't find an obvious answer to this in the literature, so I figured I'd ask the community.
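To make the split concrete, here is a minimal sketch of how the same phrase could be tagged under the old and the new scheme, using BIO tags. The label names `DOSAGE`, `DOSE_AMOUNT` and `STRENGTH` and the helper `spans_to_bio` are just placeholders for illustration, not taken from any particular library or from the actual project.

```python
def spans_to_bio(tokens, spans):
    """Convert (start, end, label) token spans (end exclusive) into BIO tags."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags


tokens = ["1", "tablet", "2.5mg"]

# Old scheme (hypothetical label name): the whole phrase is one DOSAGE entity.
old_spans = [(0, 3, "DOSAGE")]

# New scheme (hypothetical label names): dose amount and strength are separate.
new_spans = [(0, 2, "DOSE_AMOUNT"), (2, 3, "STRENGTH")]

old_tags = spans_to_bio(tokens, old_spans)
new_tags = spans_to_bio(tokens, new_spans)

for token, old, new in zip(tokens, old_tags, new_tags):
    print(f"{token:8} {old:12} {new}")
```

With the second scheme the application can read the dose amount and the strength straight from the entity labels instead of splitting a single dosage span afterwards.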
Performance will most likely decline somewhat when the number of classes is increased, but it should be manageable as long as there are not too many labels…
https://stackoverflow.com/questions/72162213/what-is-the-max-limit-of-entities-in-a-custom-ner-model
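One way to see how large the drop actually is would be to score each entity type separately on the same held-out set, before and after re-annotating. Below is a rough sketch of exact-match, span-level scoring in plain Python. The function `per_label_scores`, the label names and the toy spans are all made up for illustration; real numbers would come from your own model's predictions.

```python
from collections import Counter


def per_label_scores(gold, pred):
    """Exact-match precision/recall/F1 per label.

    gold and pred are sets of (doc_id, start, end, label) tuples.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for span in pred:
        label = span[-1]
        if span in gold:
            tp[label] += 1      # predicted span matches a gold span exactly
        else:
            fp[label] += 1      # predicted span has no exact gold match
    for span in gold:
        if span not in pred:
            fn[span[-1]] += 1   # gold span the model missed

    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        n_pred = tp[label] + fp[label]
        n_gold = tp[label] + fn[label]
        p = tp[label] / n_pred if n_pred else 0.0
        r = tp[label] / n_gold if n_gold else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = {"precision": p, "recall": r, "f1": f1}
    return scores


# Toy data (hypothetical): the STRENGTH boundary is wrong in the prediction.
gold = {(0, 0, 2, "DOSE_AMOUNT"), (0, 2, 3, "STRENGTH"), (1, 0, 1, "DRUG")}
pred = {(0, 0, 2, "DOSE_AMOUNT"), (0, 1, 3, "STRENGTH"), (1, 0, 1, "DRUG")}

scores = per_label_scores(gold, pred)
for label in sorted(scores):
    print(label, scores[label])
```

If the per-label scores for the existing types (e.g. drug) stay roughly the same after adding the new types, the extra labels are probably not hurting much.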
Thank you John!