About a generative task

Hi! I’m currently working on tuning GPT-2 for multilingual-to-English translation. My first thought is to use XLM-RoBERTa as the encoder and GPT-2 as the decoder, feeding GPT-2 the embeddings of the multilingual text produced by XLM-R, like this:

class XLM2GPT2(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.encoder = XLMRobertaModel.from_pretrained(config['encoder'])
        self.decoder = GPT2LMHeadModel.from_pretrained(config['decoder'])

    def forward(self, ids, attns, labels):
        # take the hidden states, not the whole model output object
        txt_embeds = self.encoder(input_ids=ids, attention_mask=attns).last_hidden_state
        # the argument is spelled `inputs_embeds`; GPT-2 expects labels
        # aligned with the input sequence (it shifts them internally)
        output = self.decoder(inputs_embeds=txt_embeds, labels=labels)
        return output.loss

I don’t know if this is the right way to build the model and do generative training. Would a structure like EncoderDecoderModel be a better way to implement it?
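For reference, here is roughly what I mean by the EncoderDecoderModel route. This is just a sketch: I use tiny random-weight configs so it runs without downloading checkpoints, but in practice I'd load the pretrained models with `EncoderDecoderModel.from_encoder_decoder_pretrained("xlm-roberta-base", "gpt2")`. The shapes and hyperparameters here are arbitrary, purely for illustration.

```python
import torch
from transformers import (
    EncoderDecoderConfig,
    EncoderDecoderModel,
    GPT2Config,
    XLMRobertaConfig,
)

# Tiny configs with random weights so the snippet runs instantly;
# real training would start from the pretrained checkpoints instead.
enc_cfg = XLMRobertaConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                           num_attention_heads=2, intermediate_size=64)
dec_cfg = GPT2Config(vocab_size=100, n_embd=32, n_layer=2, n_head=2)

# from_encoder_decoder_configs marks the GPT-2 config as a decoder and
# enables cross-attention so it can attend to the encoder hidden states
config = EncoderDecoderConfig.from_encoder_decoder_configs(enc_cfg, dec_cfg)
model = EncoderDecoderModel(config=config)

src_ids = torch.randint(0, 100, (2, 8))   # multilingual source tokens
src_mask = torch.ones_like(src_ids)
tgt_ids = torch.randint(0, 100, (2, 6))   # English target tokens

# encoder ids and decoder ids are passed separately, so source and
# target sequences don't need to line up the way they would with
# the inputs_embeds approach above
out = model(input_ids=src_ids, attention_mask=src_mask,
            decoder_input_ids=tgt_ids, labels=tgt_ids)
print(out.loss)
```

The appeal of this structure, as I understand it, is that the decoder attends to the encoder output through cross-attention rather than receiving it as its own input embeddings, so the source and target lengths are decoupled.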
Thanks in advance!