NER at Inference Time

Apologies in advance if this question has already been asked. I have not been able to find a convincing answer or the best way to deal with this issue.

To my understanding, NER makes predictions at the token level. Since BERT uses a sub-word tokenizer, it is entirely possible that some part of a word won’t get labeled, or that we get different labels within the same word. Both outcomes are undesirable, because in the end we want the final NER result at the word level, not the token level.

For example, see link.

Barr(PER) ien(O) tos(PER)

This should have been Barr(PER) ien(PER) tos(PER).

Here is another, more confusing prediction:

F(MISC) abric (O) … Fat (LOC) pack Sweater (ORG)

Here too, the token-level predictions are inconsistent within the same word.
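To make the problem concrete, here is a minimal sketch of how such inconsistencies could be detected. It assumes we already know which word each sub-word token belongs to (for example from the `word_ids()` method of a fast tokenizer's encoding, if I understand it correctly). The `word_ids` list and the helper names below are my own illustration, not something taken from the library.

```python
from collections import defaultdict

# Token-level output from the first example above.
# word_ids is an assumption: it says which word each sub-word token
# belongs to (all three pieces come from word 0).
tokens = ["Barr", "ien", "tos"]
word_ids = [0, 0, 0]
labels = ["PER", "O", "PER"]


def labels_per_word(word_ids, labels):
    """Collect the token-level labels predicted for each word."""
    grouped = defaultdict(list)
    for wid, label in zip(word_ids, labels):
        grouped[wid].append(label)
    return dict(grouped)


def inconsistent_words(word_ids, labels):
    """Return the ids of words whose sub-word tokens disagree."""
    return [
        wid
        for wid, labs in labels_per_word(word_ids, labels).items()
        if len(set(labs)) > 1
    ]


print(inconsistent_words(word_ids, labels))
```

Any word id returned by `inconsistent_words` is a word for which some policy is needed to pick a single label.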

So my questions are the following:

  1. How can I best convert token-level NER labels to word-level labels? What is the best policy for dealing with inconsistent token-level predictions within the same word? Is there a standard way to do this? Has this already been implemented in the Hugging Face library?

  2. Shouldn't there be a way to force the model to recognize that those three tokens came from the same word, so that they get the same token-level label in the first place?
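For the first question, these are the two policies I can think of, written as a rough sketch: take the label of the first sub-word token, or take a majority vote over the word's tokens. Both functions and the `word_ids` input are hypothetical helpers of mine, and I do not know whether either matches what is considered standard.

```python
from collections import Counter


def first_token_policy(word_ids, labels):
    """Each word takes the label of its first sub-word token."""
    word_labels = {}
    for wid, label in zip(word_ids, labels):
        # setdefault keeps the first label seen for each word id.
        word_labels.setdefault(wid, label)
    return [word_labels[wid] for wid in sorted(word_labels)]


def majority_policy(word_ids, labels):
    """Each word takes its most frequent sub-word label.

    Ties go to the label that appears first within the word.
    """
    grouped = {}
    for wid, label in zip(word_ids, labels):
        grouped.setdefault(wid, []).append(label)
    return [
        Counter(grouped[wid]).most_common(1)[0][0]
        for wid in sorted(grouped)
    ]


# The first example above: all three tokens belong to word 0 (assumed).
word_ids = [0, 0, 0]
labels = ["PER", "O", "PER"]
print(first_token_policy(word_ids, labels))
print(majority_policy(word_ids, labels))
```

On this example both policies give PER for the whole word, but they can disagree in general, for instance when the first token is labeled O and the remaining ones are not.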

Any suggestions are greatly appreciated. Thank you!