lol. That depends on the tokenizer you’re using.
Check out ‘rs-bpe’ on PyPI / GitHub. It currently outperforms both tiktoken and Hugging Face's tokenizers.
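
To see why the answer depends on the tokenizer, here is a rough sketch with three toy tokenizers. These are stand-ins I made up for illustration (the function names are hypothetical, and none of this is the rs-bpe, tiktoken, or tokenizers API); real BPE tokenizers split text differently, but the point is the same: one string, different token counts.

```python
import re


def whitespace_tokens(text: str) -> list[str]:
    # Toy tokenizer: split on runs of whitespace only.
    return text.split()


def word_punct_tokens(text: str) -> list[str]:
    # Toy tokenizer: words and individual punctuation marks are separate tokens.
    return re.findall(r"\w+|[^\w\s]", text)


def char_tokens(text: str) -> list[str]:
    # Toy tokenizer: every character (including spaces) is its own token.
    return list(text)


def count_tokens(text: str) -> dict[str, int]:
    # Run the same text through each toy tokenizer and report the token counts.
    tokenizers = {
        "whitespace": whitespace_tokens,
        "word_punct": word_punct_tokens,
        "char": char_tokens,
    }
    return {name: len(fn(text)) for name, fn in tokenizers.items()}


if __name__ == "__main__":
    print(count_tokens("Hello, world!"))
```

Swap in whichever real tokenizer you actually use to get the number that matters for your model.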