Truncated Inputs to our model

We have a model that automatically applies diacritics to Yoruba text (Davlan/mT5_base_yoruba_adr on Hugging Face). However, we noticed that for any input longer than about 20 characters, the output is cut off before the end of the sentence. For example:

Input: Mo je isu ati eyin ni Ibadan
Output: Mo jẹ́ iṣu àti ẹ̀yìn ní Ì
instead of
Mo jẹ́ iṣu àti ẹ̀yìn ní Ìbàdàn

How can we fix this?
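
One possible explanation, which we have not confirmed: if the model is called through the `transformers` `generate()` method without an explicit length limit, the library's default generation length (20 tokens in the versions we know of) may be cutting the output short. Below is a minimal sketch of a call that raises that limit. The helper names `load_model` and `restore_diacritics` and the value 256 are our own illustrative choices, not part of the model card.

```python
MODEL_NAME = "Davlan/mT5_base_yoruba_adr"


def load_model(name=MODEL_NAME):
    """Load the tokenizer and seq2seq model from the Hugging Face Hub."""
    # Imported lazily so the helper below can be exercised without
    # installing transformers or downloading any weights.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    return tokenizer, model


def restore_diacritics(text, tokenizer, model, max_new_tokens=256):
    """Generate the diacritized text with an explicit output length limit.

    max_new_tokens=256 is an arbitrary illustrative value; it only needs
    to be larger than the longest output we expect, counted in tokens.
    """
    inputs = tokenizer(text, return_tensors="pt")
    # Without max_new_tokens (or max_length), generate() falls back to the
    # model's generation config, which may stop after about 20 tokens.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the real model this would be used as `tokenizer, model = load_model()` followed by `restore_diacritics("Mo je isu ati eyin ni Ibadan", tokenizer, model)`. If the output is still cut off with a larger limit, the cause is presumably something else.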
