I found that Llama 3 uses tiktoken, and here is the difference as described by Hugging Face:
The tokenizer is a BPE model based on tiktoken (vs. the one based on the sentencepiece implementation for Llama 2). The main difference is that it ignores BPE merge rules when an input token is part of the vocab. This means that if no merge rule exists to produce “hugging”, then instead of splitting it into smaller units such as [“hug”, “ging”] (2 tokens), the tokenizer will return “hugging” directly as a single token, as long as “hugging” is part of the vocab.
I don’t quite understand this description. Is there a more detailed explanation?
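To pin down what I think the docs are saying, here is a minimal toy sketch of the two behaviors. This is not the real Llama 2 / Llama 3 tokenizer code; the vocab, the merge list, and both function names are made up purely for illustration:

```python
# Hypothetical toy vocab: "hugging" exists as a whole token,
# but no merge rule ever produces it.
vocab = {"hug", "ging", "hugging"}
merges = [("h", "u"), ("hu", "g"), ("g", "i"), ("gi", "n"), ("gin", "g")]

def bpe_encode(word, merges):
    """Classic BPE: only the merge rules decide the output.
    (Real BPE applies merges by learned priority; this toy version
    just scans the list in order, which is enough for the example.)"""
    pieces = list(word)
    while True:
        for a, b in merges:
            for i in range(len(pieces) - 1):
                if pieces[i] == a and pieces[i + 1] == b:
                    pieces[i:i + 2] = [a + b]  # apply the merge
                    break
            else:
                continue  # this rule didn't apply; try the next one
            break  # a merge was applied; rescan from the first rule
        else:
            return pieces  # no rule applies anymore; done

def tiktoken_style_encode(word, merges, vocab):
    """Behavior per the quoted docs: if the whole input chunk is
    already in the vocab, return it directly and skip the merges."""
    if word in vocab:
        return [word]
    return bpe_encode(word, merges)

print(bpe_encode("hugging", merges))                   # ['hug', 'ging']
print(tiktoken_style_encode("hugging", merges, vocab)) # ['hugging']
```

If I am reading the description right, the sentencepiece-based Llama 2 tokenizer behaves like `bpe_encode` here (the merge rules alone determine the split, so “hugging” comes out as 2 tokens), while the tiktoken-based Llama 3 tokenizer behaves like `tiktoken_style_encode` (the vocab lookup wins, so “hugging” comes out as 1 token). Is that the correct reading, or is there more to it?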