Converting TikToken to a Hugging Face Tokenizer

Hiya! We’ve trained a model using the TikToken cl100k_base tokenizer, and we now want to convert it into a compatible Hugging Face model + tokenizer. We’ve got the model converted, but we aren’t sure how to convert the TikToken tokenizer into one that works in the Hugging Face ecosystem. Any help would be greatly appreciated!
