I am working on a thesis project to build a neural machine translation system for the Myaamia language. The translation is unidirectional: we only need to go from English to Myaamia, not from Myaamia to English. AFAIK, Myaamia, being a lesser-known, low-resource language, is not covered by any of the pretrained models on Hugging Face.
I tried some of the models available on Hugging Face to see if I could get any results, but they were not good at all: I achieved a BLEU score of 0.18 (out of 100) with the T5 model. Since I am relatively new to Hugging Face and transformers, I would really appreciate it if anyone could point me in the right direction, e.g., which model I should use and how I should format my data to get the best results.
Here are the statistics for the data I have so far:
- Total: 61,559
- Train: 49,247 (80%)
- Test: 6,156 (10%)
- Validation: 6,156 (10%)
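In case it helps to see what I mean by formatting, here is a minimal sketch of how I am preparing the 80/10/10 splits. The nested `"translation"` JSONL layout is the one used by Hugging Face translation datasets, and `"mia"` is just a placeholder language code I picked for Myaamia:

```python
import json
import random

def split_pairs(pairs, train_frac=0.8, test_frac=0.1, seed=0):
    """Shuffle parallel sentence pairs and split them 80/10/10
    into train / test / validation."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n = len(pairs)
    n_train = int(n * train_frac)
    n_test = int(n * test_frac)
    return (pairs[:n_train],
            pairs[n_train:n_train + n_test],
            pairs[n_train + n_test:])

def to_jsonl(pairs, path):
    """Write pairs in the nested 'translation' format that
    Hugging Face translation datasets expect, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for en, mia in pairs:
            record = {"translation": {"en": en, "mia": mia}}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Example with dummy data standing in for the real parallel corpus
pairs = [(f"english sentence {i}", f"myaamia sentence {i}") for i in range(100)]
train, test, val = split_pairs(pairs)
```

The resulting `train.jsonl` / `test.jsonl` / `val.jsonl` files can then be loaded with `datasets.load_dataset("json", data_files=...)`, but I am not sure this layout is the best choice for fine-tuning, which is part of my question.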