NLLB 3.3B - Poor translations from Chinese to English

Hey!

I’m playing around with NLLB 3.3B and have run into some problems when translating from Simplified Chinese (FLORES language code: zho_Hans) to English. I’m curious whether anyone else has seen them.

The model reacts badly to city (e.g. Hanzhong) and company names (e.g. Midea)
Example translations:

  • “Air Conditioning in Hanzhong City” (汉中市空调) translates to “Air conditioning in China”
  • “Hanzhong” (汉中) translates to “Chinese”
  • “Midea home central air conditioner” (美的家用中央空调) translates to “The house is central air conditioned”

I don’t know any Chinese, so the “correct” translations are from Google Translate.

The model is very sensitive to unexpected characters
Pipes, parentheses, digits (0-9) and Latin letters (a-z) result in nonsense output when translating from Chinese. I also played around with M2M, which isn’t nearly as sensitive to these characters.
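
One workaround worth trying, sketched below in Python, is to split the input into runs of Chinese characters and runs of everything else, and only send the Chinese runs to the model. This is only a rough sketch and not something confirmed to fix the problem: the function names `split_cjk_runs` and `translate_cjk_only` are made up for illustration, the character ranges cover only the common CJK ideograph blocks, and translating fragments separately loses sentence context.

```python
import re

# Runs of CJK ideographs (Unified Ideographs plus Extension A).
# Assumption: punctuation, pipes, digits and Latin letters all count as "other".
CJK_RUN = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]+")


def split_cjk_runs(text):
    """Split text into (is_cjk, segment) pairs, preserving the original order."""
    segments = []
    pos = 0
    for match in CJK_RUN.finditer(text):
        if match.start() > pos:
            # Non-CJK text between the previous match and this one.
            segments.append((False, text[pos:match.start()]))
        segments.append((True, match.group()))
        pos = match.end()
    if pos < len(text):
        segments.append((False, text[pos:]))
    return segments


def translate_cjk_only(text, translate):
    """Apply `translate` (any str -> str callable) to CJK runs only."""
    return "".join(
        translate(segment) if is_cjk else segment
        for is_cjk, segment in split_cjk_runs(text)
    )
```

Here `translate` would be whatever function wraps the model call; pipes, parentheses and digits are passed through untouched.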

Has anyone else experienced this?

BR,
Felix

Encountering similar issues with Japanese to English translations using NLLB, and in some instances for Korean as well.

Same experience with Chinese/Japanese/Korean to English.
It hallucinates a lot.

I think NLLB’s Chinese translation isn’t working as well as intended; please correct me if I’m using it wrongly, but…
苹果 to English: I got 果 instead of ‘apple’.
这里有西瓜 to English: I got ‘There’s a little bit of a squirrel in there.’ instead of ‘There’s a watermelon here’.
Does anyone have tips on how to improve the translation, or know of a better model that I may have missed?
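
For comparison, here is a minimal sketch of how the model is typically called through Hugging Face transformers, in case the setup is part of the problem. It assumes the `facebook/nllb-200-3.3B` checkpoint and the FLORES-200 code `eng_Latn` for English; neither is confirmed in this thread, and `max_new_tokens=64` is an arbitrary choice. The source language has to be set on the tokenizer and the target language forced as the first generated token, otherwise the output language is not controlled.

```python
def translate(text, src_lang="zho_Hans", tgt_lang="eng_Latn",
              model_name="facebook/nllb-200-3.3B", max_new_tokens=64):
    """Translate one string with an NLLB checkpoint (sketch, reloads the model each call)."""
    # Imported lazily so the function can be defined without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # src_lang tells the tokenizer which language tag to attach to the input.
    tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        # Force the decoder to start with the target-language tag.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling `translate("这里有西瓜")` would download the checkpoint on first use; in practice the tokenizer and model should be loaded once and reused.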