I tried to train a Hugging Face BPE tokenizer for Pashto from scratch, but it does not decode words correctly.
No one can help you with such limited information.
- I am unfamiliar with Pashto. Does the official documentation say anything about BPE?
- Is it possible that a pretrained tokenizer for Pashto already exists?
- Post your code so we can understand what the software is trying to tell you.
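For reference, a minimal reproducible example could look something like the sketch below. It assumes the Hugging Face `tokenizers` library and a byte-level BPE setup; the corpus, the vocabulary size, and the sample sentence are placeholders, not your actual code or data. One common cause of garbled decoding with byte-level BPE is a pre-tokenizer without the matching decoder, though whether that applies here is only a guess without seeing your script.

```python
from tokenizers import Tokenizer, decoders, models, pre_tokenizers, trainers

# Placeholder corpus (assumption): replace with your real Pashto text.
corpus = [
    "زه ښوونځي ته ځم",
    "دا زما کتاب دی",
    "موږ په کابل کې اوسېږو",
]

# Byte-level BPE: text is split into bytes before merges are learned.
tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)

# The decoder must mirror the pre-tokenizer; without it, decode() returns
# the byte-level symbols instead of readable Pashto.
tokenizer.decoder = decoders.ByteLevel()

trainer = trainers.BpeTrainer(
    vocab_size=500,  # placeholder value, far too small for real use
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet(),
)
tokenizer.train_from_iterator(corpus, trainer=trainer)

# Round-trip check: encode a sentence, decode it, and compare.
sample = corpus[0]
encoding = tokenizer.encode(sample)
decoded = tokenizer.decode(encoding.ids)

print(encoding.tokens)  # byte-level tokens, expected to look unreadable
print(decoded)
print(decoded == sample)
```

Note that the printed tokens look scrambled by design in a byte-level setup; what matters is whether the decoded string matches the input.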
I solved the issue.