M2M model finetuning on multiple language pairs

nikhiljais · December 29, 2021, 12:26pm

Hi all, can anyone suggest how to fine-tune m2m100 on more than one language pair? I am able to fine-tune on a single pair using the script below:

CUDA_VISIBLE_DEVICES=0,1,2,3,6 python -m torch.distributed.run --nproc_per_node=5 run_translation.py --model_name_or_path=m2m100_418M_new_token --do_train --do_eval --source_lang ja --target_lang en --fp16=True --evaluation_strategy epoch --output_dir bigfrall --per_device_train_batch_size=48 --per_device_eval_batch_size=48 --overwrite_output_dir --forced_bos_token en --train_file orig_manga/orig/train_exp_frame_50k.json --validation_file orig_manga/orig/valid_exp_frame_50k.json --tokenizer_name tokenizer_new_token --num_train_epochs 50 --save_total_limit=5 --save_strategy=epoch --load_best_model_at_end=True --predict_with_generate

But now I want to fine-tune it on both the ja-en and ja-zh pairs. How do I pass both language pairs?

nfortescue · January 26, 2022, 10:23am

I’m also curious about this. @nikhiljais - did you ever work this out?

nikhiljais · January 26, 2022, 1:10pm

Not yet. Still waiting for some help.

nfortescue · February 3, 2022, 11:59am

I think I managed to do this, but my way of doing it is really hacky and fragile, so I wouldn’t recommend it. I’ve filed a feature request with the Hugging Face Transformers team to improve this at https://github.com/huggingface/transformers/issues/15500

That feature request has a link to a Colab notebook with the code for how I did it. I believe it is working, but I’m not 100% sure.
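
For readers who want a rough picture of what a multi-pair setup can look like, here is a minimal sketch. It is not the code from the notebook mentioned above. It assumes the tokenizer behaves like the `M2M100Tokenizer` in recent versions of transformers (settable `src_lang` and `tgt_lang`, and a `text_target` argument when called), so that the target-language token ends up at the start of the labels. The helper names `tag_pairs`, `mix_pairs`, and `encode_example`, and the sample sentences, are hypothetical and only for illustration.

```python
import random
from typing import Dict, Iterable, List


def tag_pairs(rows: Iterable[Dict[str, str]], src: str, tgt: str) -> List[Dict[str, str]]:
    """Turn rows like {"ja": ..., "en": ...} into records that carry their own language codes."""
    return [
        {"src_lang": src, "tgt_lang": tgt, "src_text": row[src], "tgt_text": row[tgt]}
        for row in rows
    ]


def mix_pairs(*tagged: List[Dict[str, str]], seed: int = 0) -> List[Dict[str, str]]:
    """Concatenate the tagged sets and shuffle them so batches contain several directions."""
    merged = [rec for group in tagged for rec in group]
    random.Random(seed).shuffle(merged)
    return merged


def encode_example(tokenizer, rec: Dict[str, str], max_length: int = 128):
    """Tokenize one record with the language codes stored on that record.

    Assumption: `tokenizer` is M2M100Tokenizer-like, i.e. setting src_lang /
    tgt_lang changes the language token added to the inputs / labels.
    """
    tokenizer.src_lang = rec["src_lang"]
    tokenizer.tgt_lang = rec["tgt_lang"]
    return tokenizer(
        rec["src_text"],
        text_target=rec["tgt_text"],
        max_length=max_length,
        truncation=True,
    )


# Toy data standing in for the real ja-en and ja-zh training files.
ja_en = [{"ja": "こんにちは", "en": "Hello"}, {"ja": "ありがとう", "en": "Thank you"}]
ja_zh = [{"ja": "こんにちは", "zh": "你好"}]

train_records = mix_pairs(tag_pairs(ja_en, "ja", "en"), tag_pairs(ja_zh, "ja", "zh"))
```

Each encoded record could then be fed to a seq2seq trainer in place of the single-pair preprocessing. At generation time the target language still has to be chosen per direction, for example by passing `forced_bos_token_id=tokenizer.get_lang_id("zh")` to `generate`. As far as I can tell, a single `--forced_bos_token` value like the one in the command above only covers one target language, so evaluating mixed targets would need extra handling.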

anzorq · August 17, 2022, 9:56am

Hi, @nikhiljais.

Did fine-tuning on one pair affect the quality of the other translation directions in your case?

I’m fine-tuning on a different language pair and that pair works well, but none of the other directions work at all afterwards.
