Input for multilingual EncoderDecoderModel

Hello guys,

I want to train a multilingual EncoderDecoderModel for translation.

from transformers import EncoderDecoderModel

# multilingual encoder (XLM-R) combined with a pretrained decoder (GPT-J)
model = EncoderDecoderModel.from_encoder_decoder_pretrained('xlm-roberta-base', 'EleutherAI/gpt-j-6B')

Now I’m having a hard time figuring out what input format the model expects for training. I have several parallel corpora for different language pairs. Do I need to combine them all into one training corpus?
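For context, here is a minimal sketch of how I’m currently assuming the input for one parallel pair would be prepared. The tokenizer choices, the pad-token workaround, and the example sentences are just my assumptions, and I’m not even certain GPT-J can be used as a decoder here, so please treat this only as an illustration of the input format I have in mind:

from transformers import AutoTokenizer

# My assumption: the encoder side uses the XLM-R tokenizer,
# the decoder side uses the GPT-J tokenizer
encoder_tokenizer = AutoTokenizer.from_pretrained('xlm-roberta-base')
decoder_tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-j-6B')

# GPT-J has no pad token, so I reuse the EOS token (my workaround, not from any official example)
decoder_tokenizer.pad_token = decoder_tokenizer.eos_token
model.config.decoder_start_token_id = decoder_tokenizer.bos_token_id
model.config.pad_token_id = decoder_tokenizer.pad_token_id

# One parallel sentence pair from one of my corpora (German -> English, for example)
src_text = "Das Haus ist wunderbar."
tgt_text = "The house is wonderful."

# Encoder gets the source sentence, labels are the target token ids
enc = encoder_tokenizer(src_text, truncation=True, max_length=128, return_tensors='pt')
labels = decoder_tokenizer(tgt_text, truncation=True, max_length=128, return_tensors='pt').input_ids

outputs = model(input_ids=enc.input_ids, attention_mask=enc.attention_mask, labels=labels)
print(outputs.loss)

Is this roughly the right shape of input, and if so, how should the different language pairs be fed in: mixed together in one dataset or handled separately?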

Thank you!