I’m playing around with the NLLB 3.3B model and have run into some problems when translating from Simplified Chinese (FLORES language code: zho_Hans) to English. I’m curious whether anyone else has seen them.
The model reacts badly to city names (e.g. Hanzhong) and company names (e.g. Midea):
- “Air Conditioning in Hanzhong City” (汉中市空调) translates to “Air conditioning in China”
- “Hanzhong” (汉中) translates to “Chinese”
- “Midea home central air conditioner” (美的家用中央空调) translates to “The house is central air conditioned”
I don’t know any Chinese, so the “correct” translations are from Google Translate.
The model is very sensitive to unexpected characters
Pipes, parentheses, digits (0-9), and Latin letters (a-z) result in nonsense output when translating from Chinese. I also played around with M2M, which isn’t nearly as sensitive to these characters.
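
In case it helps anyone narrow this down, here is a rough sketch of how the characters listed above could be flagged in an input, and how a cleaned "control" version could be produced to compare against. The helper names (`find_unexpected`, `strip_unexpected`) are made up for illustration, and including upper-case letters and full-width parentheses in the pattern is my own assumption. It does not call the model; it only prepares the inputs.

```python
import re

# Characters reported as problematic above: pipes, parentheses, 0-9, a-z.
# Upper-case letters and full-width parentheses are included as an assumption.
UNEXPECTED = re.compile(r"[|()（）0-9A-Za-z]")


def find_unexpected(text):
    """Return (position, character) pairs for each flagged character."""
    return [(m.start(), m.group()) for m in UNEXPECTED.finditer(text)]


def strip_unexpected(text):
    """Return a control input with flagged characters replaced by spaces
    and runs of whitespace collapsed."""
    cleaned = UNEXPECTED.sub(" ", text)
    return " ".join(cleaned.split())
```

Translating both the original string and the output of `strip_unexpected` should show whether the nonsense output really comes from these characters or from something else in the sentence.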
Has anyone else experienced this?