Token Classification Label order

I have fine-tuned several BERT models to perform NER tasks, using a dataset annotated in IOB format.

Before using Hugging Face with BERT, I performed the NER task with a BiLSTM-CRF model implemented in TensorFlow.

With my BiLSTM-CRF model, I had a recurring problem with the order of the predicted labels:

sequence = ["Bruno", "Garcia", "is", "a", "great", "person"]
prediction = ["I-Person", "B-Person", "O", "O", "O", "O"]

As you can see, the labels are wrong and "invalid", since an I-Person tag should always be preceded by a B-Person (or another I-Person) tag.
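To make the problem concrete, here is a minimal sketch of a checker that flags exactly this kind of invalid IOB sequence (the function name and logic are mine, not from any library):

```python
def find_invalid_iob(tags):
    """Return indices of I- tags that do not follow a B- or I- tag
    of the same entity type (invalid under the IOB2 scheme)."""
    invalid = []
    prev = "O"
    for i, tag in enumerate(tags):
        if tag.startswith("I-"):
            entity = tag[2:]
            if prev not in (f"B-{entity}", f"I-{entity}"):
                invalid.append(i)
        prev = tag
    return invalid

prediction = ["I-Person", "B-Person", "O", "O", "O", "O"]
print(find_invalid_iob(prediction))  # → [0]: the leading I-Person is invalid
```

Running this on the prediction above reports index 0, the stray I-Person with no B-Person before it.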

Using AutoModelForTokenClassification, the fine-tuned model never made that mistake.

So my question is: how does AutoModelForTokenClassification avoid this problem?
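For what it's worth, with a CRF layer like the one in my BiLSTM-CRF, the usual way to hard-enforce valid label order is to forbid illegal transitions at decoding time. Below is a minimal, self-contained sketch of constrained Viterbi decoding; the tag set, the hypothetical emission scores, and the helper names are all mine, and a real model would supply its own per-token scores (e.g. the logits of a token-classification head):

```python
import math

TAGS = ["O", "B-Person", "I-Person"]

def allowed(prev_tag, tag):
    """Forbid I-X unless it follows B-X or I-X of the same type."""
    if tag.startswith("I-"):
        return prev_tag in (f"B-{tag[2:]}", f"I-{tag[2:]}")
    return True

def constrained_viterbi(emissions):
    """emissions: list of {tag: score} per token.
    Returns the highest-scoring tag sequence with no invalid transitions."""
    # At the first token, I- tags are not allowed to start a sequence.
    prev_scores = {t: (emissions[0].get(t, -math.inf), None)
                   for t in TAGS if not t.startswith("I-")}
    history = [prev_scores]
    for em in emissions[1:]:
        cur = {}
        for tag in TAGS:
            best = (-math.inf, None)
            for ptag, (pscore, _) in prev_scores.items():
                if allowed(ptag, tag):
                    score = pscore + em.get(tag, -math.inf)
                    if score > best[0]:
                        best = (score, ptag)
            cur[tag] = best
        history.append(cur)
        prev_scores = cur
    # Backtrack from the best final tag.
    tag = max(prev_scores, key=lambda t: prev_scores[t][0])
    path = [tag]
    for step in reversed(history[1:]):
        tag = step[tag][1]
        path.append(tag)
    return list(reversed(path))

# Hypothetical scores where the model prefers the invalid tag I-Person
# at position 0; the constrained decoder picks B-Person instead.
emissions = [{"I-Person": 5.0, "B-Person": 1.0, "O": 0.0},
             {"I-Person": 3.0, "B-Person": 1.0, "O": 0.0},
             {"O": 2.0}]
print(constrained_viterbi(emissions))  # → ["B-Person", "I-Person", "O"]
```

Even with this in place, I would still like to understand why the plain token-classification head, which as far as I know scores each token independently, never produced an invalid sequence in my experiments.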