[EncoderDecoder] Parameter sharing

Is it possible to use EncoderDecoderModel with encoder-decoder parameter sharing as used in https://arxiv.org/abs/1907.12461 (ROBERTASHARE, BERTSHARE)? The model still achieves state-of-the-art results while being cheaper to train. Or are there plans to implement this?

Yes! This can be done by passing `tie_encoder_decoder=True`:

from transformers import EncoderDecoderModel

roberta2roberta = EncoderDecoderModel.from_encoder_decoder_pretrained("roberta-base", "roberta-base", tie_encoder_decoder=True)
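To make the effect of tying concrete, here is a minimal PyTorch sketch of what parameter sharing means: the decoder's submodules end up pointing at the very same tensors as the encoder's, so the shared weights are stored and updated only once. The `nn.Linear` layers below are illustrative stand-ins, not the actual RoBERTa module structure.

```python
import torch.nn as nn

# Two independent layers, standing in for matching encoder/decoder submodules.
encoder_layer = nn.Linear(768, 768)
decoder_layer = nn.Linear(768, 768)

# Before tying: two separate weight tensors in memory.
assert encoder_layer.weight.data_ptr() != decoder_layer.weight.data_ptr()

# Tying: rebind the decoder's parameters to the encoder's tensors
# (conceptually what tie_encoder_decoder=True does for matching modules).
decoder_layer.weight = encoder_layer.weight
decoder_layer.bias = encoder_layer.bias

# After tying: both layers share the same underlying storage, so a
# gradient step on one is a gradient step on both.
assert decoder_layer.weight.data_ptr() == encoder_layer.weight.data_ptr()

# The number of *unique* parameter tensors across both layers is now 2
# (one weight, one bias) instead of 4.
unique = {p.data_ptr() for p in list(encoder_layer.parameters()) + list(decoder_layer.parameters())}
print(len(unique))
```

This is also why the shared model is cheaper to train: roughly half the encoder-decoder parameters disappear from the optimizer state and checkpoints.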

Here’s an example model