So BPE isn't perfect: it can't even produce a good-enough segmentation of such a simple Chinese string.
I read the article you pasted, and it was informative.
But ChatGPT can reverse the word correctly. Doesn't it use a tokenizer?
