Hi all, can anyone suggest how to fine-tune m2m100 on more than one language pair? I am able to fine-tune on a single pair using the script below:
CUDA_VISIBLE_DEVICES=0,1,2,3,6 python -m torch.distributed.run --nproc_per_node=5 run_translation.py --model_name_or_path=m2m100_418M_new_token --do_train --do_eval --source_lang ja --target_lang en --fp16=True --evaluation_strategy epoch --output_dir bigfrall --per_device_train_batch_size=48 --per_device_eval_batch_size=48 --overwrite_output_dir --forced_bos_token "en" --train_file orig_manga/orig/train_exp_frame_50k.json --validation_file orig_manga/orig/valid_exp_frame_50k.json --tokenizer_name tokenizer_new_token --num_train_epochs 50 --save_total_limit=5 --save_strategy=epoch --load_best_model_at_end=True --predict_with_generate
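For reference, my train and validation files use the JSON-lines format that run_translation.py expects, one "translation" object per line (the sentences here are just made-up placeholders, not my actual data):

```json
{"translation": {"ja": "こんにちは、世界。", "en": "Hello, world."}}
{"translation": {"ja": "これはテストです。", "en": "This is a test."}}
```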
But now I want to fine-tune it on both the ja-en and ja-zh pairs. How can I pass both language pairs to the script?
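From what I can tell, the stock run_translation.py only accepts a single --source_lang/--target_lang pair and a single --forced_bos_token per run, so my guess is that mixing pairs would need per-example preprocessing. Here is a minimal sketch of what I have in mind (my own assumption, not taken from the script; `preprocess` is just a hypothetical helper):

```python
# Sketch only: pick the target language per example instead of relying on a
# single --forced_bos_token flag. Assumes each JSON line has "ja" plus one of
# "en"/"zh" under the "translation" key.
from transformers import M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

def preprocess(example):
    pair = example["translation"]
    tgt_lang = "en" if "en" in pair else "zh"
    tokenizer.src_lang = "ja"      # prepends the source language token to the inputs
    tokenizer.tgt_lang = tgt_lang  # prepends the target language token to the labels
    model_inputs = tokenizer(pair["ja"], truncation=True, max_length=128)
    labels = tokenizer(text_target=pair[tgt_lang], truncation=True, max_length=128)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# At eval/generation time the forced BOS token would presumably also have to
# be set per example, e.g. forced_bos_token_id=tokenizer.get_lang_id(tgt_lang).
```

Would something like this be the right direction, or is there a supported way to pass multiple pairs to run_translation.py directly?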