Fine-tuning of multilingual (translation) models

Hi guys,

I want to fine-tune a pre-trained multilingual model (MarianMT in this case) for domain-specific translation. The model should be able to translate between 5 different languages, and I have domain-specific datasets for every sentence pair (e.g. de-en, en-de, de-es, es-de and so on). The fine-tuning tutorials I could find only cover single language pairs (e.g. the pre-trained “Helsinki-NLP/opus-mt-en-roa” model is downloaded and then fine-tuned on an en-roa dataset). What I want to do instead is train the whole multilingual model, not just en-roa: mix the sentences of all the sentence pairs of my datasets into one big dataset and fine-tune the whole multilingual model on it. How can I achieve this? Is it possible to download the “whole” model, rather than just the language-pair models like en-roa? I hope someone can help me :)

Best regards,

Simon

Try this: https://github.com/masakhane-io/lafand-mt (MAFAND-MT)
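
In case a concrete starting point helps: as far as I know, Marian doesn’t publish a single checkpoint covering every direction between five arbitrary languages (the multilingual group checkpoints like opus-mt-en-roa mark the target language with a `>>xxx<<` token prepended to the source sentence), so a many-to-many model such as M2M-100 or mBART-50 may be closer to what you describe. Below is a minimal sketch of the mix-and-fine-tune setup with Hugging Face Transformers, assuming `facebook/m2m100_418M` as the base model; the file paths, column names, and hyperparameters are hypothetical placeholders, not from any tutorial.

```python
# Minimal sketch: mix several translation directions into one dataset and
# fine-tune a single many-to-many model. Assumes facebook/m2m100_418M as the
# base (Marian itself ships per-pair/per-group checkpoints). File names and
# the {"src": ..., "tgt": ...} record layout are hypothetical placeholders.
from datasets import load_dataset, concatenate_datasets
from transformers import (
    DataCollatorForSeq2Seq,
    M2M100ForConditionalGeneration,
    M2M100Tokenizer,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "facebook/m2m100_418M"
tokenizer = M2M100Tokenizer.from_pretrained(model_name)
model = M2M100ForConditionalGeneration.from_pretrained(model_name)

# One (src_lang, tgt_lang, file) entry per translation direction.
directions = [
    ("de", "en", "data/de-en.json"),
    ("en", "de", "data/en-de.json"),
    ("de", "es", "data/de-es.json"),
    ("es", "de", "data/es-de.json"),
    # ... remaining pairs
]

def make_preprocess(src_lang, tgt_lang):
    def preprocess(batch):
        # M2M-100 encodes the direction via language codes on the tokenizer,
        # so each direction's examples are tokenized with its own settings.
        tokenizer.src_lang = src_lang
        tokenizer.tgt_lang = tgt_lang
        return tokenizer(
            batch["src"], text_target=batch["tgt"],
            max_length=128, truncation=True,
        )
    return preprocess

parts = []
for src_lang, tgt_lang, path in directions:
    ds = load_dataset("json", data_files=path, split="train")
    parts.append(
        ds.map(make_preprocess(src_lang, tgt_lang),
               batched=True, remove_columns=ds.column_names)
    )

# Mix all directions into one training set; shuffling is what actually
# interleaves the pairs so every batch contains several directions.
train_dataset = concatenate_datasets(parts).shuffle(seed=42)

args = Seq2SeqTrainingArguments(
    output_dir="multilingual-domain-mt",
    per_device_train_batch_size=16,
    learning_rate=5e-5,
    num_train_epochs=3,
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

If you stay with a multilingual Marian checkpoint instead, the same structure should work: swap in `MarianTokenizer`/`MarianMTModel` and, in `preprocess`, prepend the target-language token (e.g. `>>spa<<`) to each source sentence rather than setting `src_lang`/`tgt_lang`.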
