I have read the documentation on building a tokenizer from scratch, but I cannot find any information about building a multilingual tokenizer. Does anybody have any suggestions?