Hello,
I understand that when I set the add_prefix_space=True option on the BPE tokenizer, the tokenizer will add a space at the beginning of every sequence.
Are there any specific advantages to using the add_prefix_space=True option with a BPE tokenizer (compared to not using it)? None of my sequences start with a space.
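
To make my understanding concrete, here is a toy sketch in plain Python of what I think the option does. It is not the library's code: the helper name toy_pre_tokenize is made up, and I am assuming a byte-level style BPE where a word's leading space is kept as part of the piece and shown as "Ġ".

```python
import re
from typing import List


def toy_pre_tokenize(text: str, add_prefix_space: bool = False) -> List[str]:
    """Hypothetical stand-in for a byte-level pre-tokenizer (illustration only)."""
    # Assumption: the space is only added when the text does not already start with one.
    if add_prefix_space and not text.startswith(" "):
        text = " " + text
    # Split into words, letting each word keep its leading space if it has one.
    pieces = re.findall(r" ?\S+", text)
    # Show the leading space as "Ġ", the marker seen in byte-level BPE vocabularies.
    return [piece.replace(" ", "Ġ") for piece in pieces]


without_prefix = toy_pre_tokenize("hello hello")
with_prefix = toy_pre_tokenize("hello hello", add_prefix_space=True)

print(without_prefix)  # first "hello" has no marker, second one does
print(with_prefix)     # both occurrences look the same
```

If this picture is roughly right, the option only seems to change how the first word is treated, so that it is split the same way as the same word appearing later in a sequence.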
Thanks,