Hello guys,
I want to train a multilingual EncoderDecoderModel for translation.
model = EncoderDecoderModel.from_encoder_decoder_pretrained('xlm-roberta-base', 'EleutherAI/gpt-j-6B')
Now I’m having a hard time figuring out what input format the model expects for training. I also have several parallel corpora for different language pairs - do I need to combine them all into one training corpus?
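For context, here is a minimal sketch of how I was thinking of merging the corpora myself, assuming each one is just a list of (source, target) sentence pairs - the corpus names and sentences are placeholders:

```python
# Sketch: flatten several parallel corpora into one training set,
# tagging each example with its language pair.
# Assumes each corpus is a list of (source, target) sentence pairs;
# names and sentences here are placeholders, not real data.
corpora = {
    "de-en": [("Hallo Welt", "Hello world")],
    "fr-en": [("Bonjour le monde", "Hello world")],
}

def combine(corpora):
    """Merge all language pairs into one list of examples,
    keeping the language-pair tag on each example."""
    combined = []
    for pair, examples in corpora.items():
        for src, tgt in examples:
            combined.append({"lang_pair": pair, "src": src, "tgt": tgt})
    return combined

dataset = combine(corpora)
```

Each example would then be tokenized into `input_ids` (source) and `labels` (target) before being fed to the model - but I’m not sure whether this merging is the right approach in the first place.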
Thank you!