Translation model for 100+ languages

Hello everyone,

I am currently engaged in a project that involves translating text from English into more than 100 languages. The translation quality must be on par with services such as DeepL or Google Translate.

Is there a model that fulfills these criteria and can be operated locally without relying on external APIs? Furthermore, does this model have the capability to translate HTML source code?

It would be preferable if the model is compatible with Python, as that is my main working environment.

Thank you in advance for any assistance and advice.


I did a search with 9 randomly selected languages as criteria. In my experience, Mistral Nemo and Qwen 2.5 translate well in actual use, but supporting 100 languages as-is would be difficult; they would likely need some additional fine-tuning.
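
For reference, here is a rough sketch of how I run this kind of model locally for translation with transformers. The checkpoint name and prompt wording are just my assumptions; swap in whatever model works best for you.

```python
# Minimal local English -> X translation with a chat-style instruct model.
# Assumes a Qwen2.5 instruct checkpoint; any similar instruct model works.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

def translate(text: str, target_language: str) -> str:
    """Translate English `text` into `target_language` via a chat prompt."""
    messages = [
        {"role": "system", "content": "You are a professional translator."},
        {
            "role": "user",
            "content": (
                f"Translate the following English text into {target_language}. "
                f"Return only the translation.\n\n{text}"
            ),
        },
    ]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=512, do_sample=False)
    # Strip the prompt tokens and decode only the newly generated text.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )

print(translate("Hello, how are you?", "German"))
```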

For an application that translates into 100 languages, a small model like an 8B will struggle, but running a large model locally is also very difficult.
If you have enough GPU, you can download and run a 70B model or even a 405B model…

In my experience, 4-bit quantized models don't cause much trouble for translation and inference, so I'd suggest finding a model that feels good first and then quantizing it for actual use (there is a rough loading sketch below).
If you want to try a large model, the following Space, released recently, is useful.
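
For the quantization part, this is roughly what I mean. Again, the model name is just an assumption, and the config values are typical defaults rather than a recommendation:

```python
# Rough sketch: load a checkpoint in 4-bit via bitsandbytes so a larger
# model fits on a single GPU; it then drops into the translate() helper above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Qwen/Qwen2.5-7B-Instruct"  # assumed checkpoint
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```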

Ok. Thanks, @John6666! I will try your suggestions!


Hi @Danny28, were you able to test this? How was the quality? I am also looking into En-to-X translation and need recommendations for good models (it doesn't have to be a single model).


Hey group, I would be interested in doing research in this domain. Please ping me if there is an active initiative for this.
