How to create a BERT2Rand Encoder-Decoder model

kaushal · March 15, 2021, 6:23am

There are multiple helpful references (for example, https://colab.research.google.com/drive/1WIk2bxglElfZewOHboPFNj8H44_VAyKE?usp=sharing#scrollTo=6r2-M5hYt-Vw) for creating instances of BERT2BERT and BERT2Share models. I was wondering how to create a BERT2Rand Encoder-Decoder model, where the encoder parameters are loaded from a pre-trained checkpoint and the decoder parameters are randomly initialized.

After reading the documentation and code, I tried this:
EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-multilingual-cased", None)
which gave the following error:
Huggingface AssertionError: If *decoder_model* is not defined as an argument, a *decoder pretrained model_name_or_path* has to be define

I am not sure how to fix this. Please help me with this. Thank you!

valhalla · March 15, 2021, 7:56am

Hi there,

If you want a randomly initialized decoder, you can create the decoder separately, save it, and then pass its path to from_encoder_decoder_pretrained. The following code snippet shows how to initialize the encoder from pre-trained bert-base and pair it with a randomly initialized decoder of bert-base size.

from transformers import BertConfig, BertLMHeadModel, EncoderDecoderModel

decoder_config = BertConfig(is_decoder=True)  # bert-base size
decoder = BertLMHeadModel(decoder_config)
decoder.save_pretrained("decoder") # save the decoder

model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "decoder")

# verify the decoder: both flags should be True
print(model.decoder.config.is_decoder)
print(model.decoder.config.add_cross_attention)
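
A possible variant, offered as a sketch rather than the recommended way: the save-to-disk step can be skipped by building the decoder in memory and handing both models to the EncoderDecoderModel constructor. The helper name make_bert2rand is hypothetical (it is not part of transformers), and the sketch assumes a BERT encoder. It copies the encoder config so the decoder gets the same vocabulary size, which matters for checkpoints such as bert-base-multilingual-cased whose vocabulary differs from the BertConfig default.

```python
import copy

from transformers import BertConfig, BertLMHeadModel, BertModel, EncoderDecoderModel


def make_bert2rand(encoder):
    """Pair an already-loaded BERT encoder with a randomly initialized BERT decoder.

    `make_bert2rand` is a hypothetical helper name, not a transformers API.
    """
    # Assumption of this sketch: the encoder is a BERT model.
    if not isinstance(encoder.config, BertConfig):
        raise TypeError("this sketch only handles BERT encoders")

    # Copy the encoder config so the decoder has the same size and vocabulary.
    decoder_config = copy.deepcopy(encoder.config)
    # Set both flags before building the decoder so that the cross-attention
    # layers are created as part of the model.
    decoder_config.is_decoder = True
    decoder_config.add_cross_attention = True

    # Building from a config (rather than from_pretrained) gives random weights.
    decoder = BertLMHeadModel(decoder_config)
    return EncoderDecoderModel(encoder=encoder, decoder=decoder)


# Usage (downloads the checkpoint, so it is left commented out here):
# encoder = BertModel.from_pretrained("bert-base-multilingual-cased")
# model = make_bert2rand(encoder)
```

The resulting model can be checked the same way as above, through model.decoder.config.is_decoder and model.decoder.config.add_cross_attention.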

kaushal · March 16, 2021, 10:26am

Thank you @valhalla! It is helpful information.
