How to create a BERT2Rand Encoder-Decoder model

There are several helpful references (https://colab.research.google.com/drive/1WIk2bxglElfZewOHboPFNj8H44_VAyKE?usp=sharing#scrollTo=6r2-M5hYt-Vw) for creating instances of BERT2BERT and BERT2Share models. How can I create a BERT2Rand Encoder-Decoder model, where the encoder parameters are loaded from a pre-trained checkpoint and the decoder parameters are randomly initialized?

After reading the documentation and the code, I tried this:
EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-multilingual-cased", None)
which gave the following error:
Huggingface AssertionError: If *decoder_model* is not defined as an argument, a *decoder pretrained model_name_or_path* has to be defined

I am not sure how to fix this. Please help me with this. Thank you!


Hi there,

if you want a randomly initialized decoder, you can create the decoder separately, save it, and then pass the saved path to from_encoder_decoder_pretrained. The following code snippet shows how to initialize the encoder from the pre-trained bert-base checkpoint and pair it with a randomly initialized decoder of bert-base size.

from transformers import BertConfig, BertLMHeadModel, EncoderDecoderModel

decoder_config = BertConfig(is_decoder=True)  # bert-base size
decoder = BertLMHeadModel(decoder_config)  # built from a config, so the weights are random
decoder.save_pretrained("decoder")  # save the decoder

model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "decoder")

# verify the decoder; both values are expected to be True
print(model.decoder.config.is_decoder)
print(model.decoder.config.add_cross_attention)
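
The same idea can also be sketched without saving the decoder to disk, by passing model objects straight to the EncoderDecoderModel constructor. This is only a minimal sketch under some assumptions: build_bert2rand is a hypothetical helper name (not part of transformers), and by default it copies the encoder's config for the decoder so that the vocabulary size and hidden size match, which matters for a checkpoint like bert-base-multilingual-cased whose vocabulary differs from the BertConfig defaults.

```python
import copy
from typing import Optional

from transformers import BertConfig, BertLMHeadModel, BertModel, EncoderDecoderModel


def build_bert2rand(
    encoder: BertModel, decoder_config: Optional[BertConfig] = None
) -> EncoderDecoderModel:
    """Pair an existing encoder with a randomly initialized BERT decoder.

    build_bert2rand is a hypothetical helper, not a transformers API.
    """
    if decoder_config is None:
        # Assumption: reuse the encoder's sizes (hidden size, layers, vocab).
        decoder_config = copy.deepcopy(encoder.config)

    # Set both flags before building the decoder so that the causal mask
    # and the cross-attention layers are actually created.
    decoder_config.is_decoder = True
    decoder_config.add_cross_attention = True

    # Building from a config (not from_pretrained) gives random weights.
    decoder = BertLMHeadModel(decoder_config)
    return EncoderDecoderModel(encoder=encoder, decoder=decoder)


# Usage (downloads the checkpoint, so it is left commented out):
# encoder = BertModel.from_pretrained("bert-base-multilingual-cased")
# model = build_bert2rand(encoder)
```

Whether this is preferable to the save-and-reload route above is a matter of taste; it mainly avoids the temporary "decoder" directory.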

Thank you @valhalla! It is helpful information.