Tokenizer mapping the same token to multiple token_ids

Hello,

I’d like to re-raise this question because I haven’t made any progress toward figuring out this particular phenomenon.

The Hugging Face guide on tokenizers seems to imply that a tokenizer should be consistent, i.e. map a given token to a single token_id, which is not what I am experiencing.


(Reference: “Summary of the tokenizers”, Hugging Face documentation)
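For context, here is a minimal sketch of the kind of check I have in mind, assuming a standard `transformers` `AutoTokenizer` (the model name is just a placeholder; substitute the tokenizer where you see the behaviour):

```python
from collections import defaultdict
from transformers import AutoTokenizer

# Placeholder model; swap in the tokenizer that shows the phenomenon.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Hello world, hello world."

# 1) Determinism check: encoding the same text twice should give identical ids.
ids_a = tokenizer.encode(text)
ids_b = tokenizer.encode(text)
print("deterministic:", ids_a == ids_b)

# 2) Collect which ids each surface token string was assigned within this text.
tokens = tokenizer.convert_ids_to_tokens(ids_a)
ids_per_token = defaultdict(set)
for tok, tok_id in zip(tokens, ids_a):
    ids_per_token[tok].add(tok_id)

# Report any token string that came back with more than one id.
for tok, tok_ids in ids_per_token.items():
    if len(tok_ids) > 1:
        print(f"token {tok!r} mapped to multiple ids: {sorted(tok_ids)}")
```

If the guide’s notion of consistency holds, the second check should print nothing, so any output here would be the phenomenon I’m asking about.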

If anyone knows whether the phenomenon described in the initial post is normal or not, please let me know!