Issue with tokenizer.tokenize

When I test tokenizer.tokenize('how do you do') (RobertaTokenizer in pytorch_transformers/tokenization_roberta.py), it returns ['how', 'Ġ', 'do', 'Ġ', 'you', 'Ġ', 'do']. I want to know where is the wrong.

There is some discussion in this thread and this one. Perhaps it helps?

See also this post in the forum.
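
For anyone landing here later, some background on the Ġ symbol may help. As far as I understand it, RoBERTa reuses GPT-2's byte-level BPE, where every byte is shown as a printable character and the space byte (0x20) is displayed as Ġ (U+0120). The sketch below rebuilds that byte-to-character table in plain Python. It is not the library's code: `show_pretokens` and its simplified regex are hypothetical stand-ins I made up for illustration, and no BPE merges are applied, so real output can split words further.

```python
import re


def bytes_to_unicode():
    """Map each of the 256 byte values to a printable character,
    following the scheme used by GPT-2's byte-level BPE."""
    # Bytes that are already printable keep their own character.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # All remaining bytes (controls, space, ...) are shifted to 256 + n.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


BYTE_TO_CHAR = bytes_to_unicode()


def show_pretokens(text):
    """Hypothetical helper: split text so a leading space stays attached
    to the following word, then render each piece with the byte table.
    The regex is a simplification (assumption), not the library pattern."""
    pieces = re.findall(r" ?\w+| ?[^\w\s]+|\s+", text)
    return [
        "".join(BYTE_TO_CHAR[b] for b in piece.encode("utf-8"))
        for piece in pieces
    ]


print(BYTE_TO_CHAR[ord(" ")])           # Ġ
print(show_pretokens("how do you do"))  # ['how', 'Ġdo', 'Ġyou', 'Ġdo']
```

On this reading, Ġ normally appears glued to the word that follows it ('Ġdo'), so a stand-alone 'Ġ' token between words, as in the output above, suggests the space is being split off before the BPE step rather than a problem with the symbol itself.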

where is the wrong

That’s a new one. I haven’t seen that expression before. LOL