Tokenizer producing token ids greater than the vocabulary size

I am using a tokenizer with a vocab size of 30522, but the tokenized dataset contains token ids of 50,000 and above. Is that even possible? What might I be doing wrong?
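
For concreteness, here is a minimal sketch of the kind of check I have in mind (assuming a Hugging Face tokenizer; `bert-base-uncased` and the sample texts are just illustrative stand-ins for my actual setup, picked because its vocab size is also 30522):

```python
from transformers import AutoTokenizer

# Illustrative tokenizer with a 30522-token base vocabulary.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

print(tokenizer.vocab_size)  # base vocabulary size
print(len(tokenizer))        # base vocabulary plus any added special tokens

# Encode a couple of placeholder sentences.
texts = ["An example sentence.", "Another example sentence."]
encoded = tokenizer(texts)

# Largest token id the tokenizer actually produced.
max_id = max(max(ids) for ids in encoded["input_ids"])
print(max_id)

# My understanding is that every id should satisfy max_id < len(tokenizer),
# so ids of 50,000+ would suggest the dataset was encoded with a different
# (larger-vocabulary) tokenizer than the one I am loading here.
assert max_id < len(tokenizer)
```

In my actual run, the equivalent of this assertion fails, which is what prompted the question.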