NLLB 3.3B - Poor translations from Chinese to English

Hey!

I’m playing around with NLLB 3.3B and have run into some problems when translating from Simplified Chinese (FLORES language code: zho_Hans) to English. I’m curious whether anyone else has seen them.

The model reacts badly to city names (e.g. Hanzhong) and company names (e.g. Midea)
Example translations:

  • Air Conditioning in Hanzhong City (汉中市空调) translates to Air conditioning in China
  • Hanzhong (汉中) translates to Chinese
  • Midea home central air conditioner (美的家用中央空调) translates to The house is central air conditioned

I don’t know any Chinese, so the “correct” translations are from Google Translate.
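For anyone who wants to reproduce the examples above, here is a minimal sketch. It assumes the Hugging Face transformers library and the facebook/nllb-200-3.3B checkpoint, which may not match the exact setup used for the outputs above; `max_new_tokens=64` is an arbitrary choice, and `eng_Latn` is the FLORES code for English.

```python
# Assumed setup: Hugging Face transformers with the public NLLB-200 3.3B checkpoint.
MODEL_NAME = "facebook/nllb-200-3.3B"
SRC_LANG = "zho_Hans"  # Simplified Chinese (FLORES code from the post)
TGT_LANG = "eng_Latn"  # English (FLORES code)

# The three source strings from the examples above.
EXAMPLES = ["汉中市空调", "汉中", "美的家用中央空调"]


def translate(texts, model_name=MODEL_NAME, src_lang=SRC_LANG, tgt_lang=TGT_LANG):
    """Translate a list of strings with NLLB and return the decoded outputs."""
    # Imported here so that nothing heavy is loaded until the function is called.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(texts, return_tensors="pt", padding=True)
    generated = model.generate(
        **inputs,
        # NLLB selects the output language via the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=64,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(EXAMPLES)` downloads the model on first use and returns one English string per input, so the outputs can be compared against the ones listed above.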

The model is very sensitive to unexpected characters
Pipes, parentheses, digits (0-9), and Latin letters (a-z) result in nonsense output when translating from Chinese. I also tried M2M, which isn’t nearly as sensitive to these characters.
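One way to check whether these characters are really the trigger is to translate only the Chinese runs and pass everything else through unchanged. The sketch below is an untested idea, not a known fix: `split_runs` and `translate_cjk_only` are hypothetical helper names, `translate_fn` stands for whatever function calls the model on a single string, and the regex only covers the basic CJK ideograph block plus Extension A.

```python
import re

# Runs of Han characters (CJK Extension A and the basic CJK Unified Ideographs block).
CJK_RUN = re.compile(r"([\u3400-\u4dbf\u4e00-\u9fff]+)")


def split_runs(text):
    """Split text into (is_cjk, chunk) pairs, preserving the original order."""
    parts = CJK_RUN.split(text)
    return [(bool(CJK_RUN.fullmatch(part)), part) for part in parts if part]


def translate_cjk_only(text, translate_fn):
    """Send only the Han-character runs to translate_fn; copy everything else as-is."""
    pieces = []
    for is_cjk, chunk in split_runs(text):
        pieces.append(translate_fn(chunk) if is_cjk else chunk)
    return "".join(pieces)
```

Note that translating the fragments separately throws away the surrounding context, so this is more useful as a diagnostic than as a production workaround.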

Has anyone else experienced this?

BR,
Felix

I'm encountering similar issues with Japanese-to-English translations using NLLB, and in some instances with Korean as well.

Same experience with Chinese/Japanese/Korean to English.
It hallucinates a lot.