It looks like you are not using the "fast" version of the tokenizer. Check to make sure.
https://huggingface.co/transformers/model_doc/roberta.html#robertatokenizerfast
from transformers import RobertaTokenizerFast
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
tokenizer("Hello world")["input_ids"]
[0, 31414, 232, 328, 2]
tokenizer(" Hello world")["input_ids"]
[0, 20920, 232, 2]
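One way to check which version you have is to look at the tokenizer's `is_fast` property, which Hugging Face tokenizers expose. The helper below is only a minimal sketch: the name `is_fast_tokenizer` is made up for illustration, and it treats any object without that attribute as not fast.

```python
def is_fast_tokenizer(tokenizer) -> bool:
    """Return True if the tokenizer reports itself as a "fast" tokenizer."""
    # Hugging Face tokenizers expose an `is_fast` property.
    # Assumption: an object without that attribute is treated as not fast.
    return bool(getattr(tokenizer, "is_fast", False))
```

With transformers installed, calling `is_fast_tokenizer(tokenizer)` on the `RobertaTokenizerFast` instance loaded above should return True, and it should return False for a tokenizer loaded through the plain `RobertaTokenizer` class.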