I want to create an EncoderDecoderModel for a translation task using a Bert2Bert configuration, where the encoder is pre-trained and frozen and the decoder is randomly initialized. The BertGeneration documentation says:
> We developed a Transformer-based sequence-to-sequence model that is compatible with publicly available pre-trained BERT, GPT-2 and RoBERTa checkpoints and conducted an extensive empirical study on the utility of initializing our model, both encoder and decoder, with these checkpoints.
Can I use a new, randomly initialized model as the BertGenerationDecoder, or is it better to use BertGeneration with pre-trained weights? Here is my current decoder setup:
```python
model_config = BertConfig(
    vocab_size=tokenizer.vocab_size,
    hidden_size=hidden_size,
    add_cross_attention=True,
    is_decoder=True,
    bos_token_id=tokenizer.cls_token_id,
    eos_token_id=tokenizer.sep_token_id,
)
model = BertGenerationDecoder(config=model_config)
```
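For context, here is a minimal sketch of the full setup I have in mind: a frozen encoder combined with a randomly initialized decoder inside an `EncoderDecoderModel`. The tiny config sizes are placeholders so the snippet runs quickly; in a real run the encoder would instead come from a pre-trained checkpoint via `from_pretrained`, and the hyperparameters are assumptions, not recommendations.

```python
from transformers import (
    BertGenerationConfig,
    BertGenerationDecoder,
    BertGenerationEncoder,
    EncoderDecoderModel,
)

# Tiny configs so the sketch runs quickly. In practice the encoder would be
# loaded from a checkpoint, e.g.
#   encoder = BertGenerationEncoder.from_pretrained("bert-base-uncased")
# (checkpoint choice is an assumption here).
enc_config = BertGenerationConfig(
    vocab_size=30522,
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=128,
)
dec_config = BertGenerationConfig(
    vocab_size=30522,
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=128,
    is_decoder=True,           # required for a decoder
    add_cross_attention=True,  # attend to the encoder's hidden states
)

encoder = BertGenerationEncoder(enc_config)  # stands in for the pre-trained encoder
decoder = BertGenerationDecoder(dec_config)  # randomly initialized decoder

model = EncoderDecoderModel(encoder=encoder, decoder=decoder)

# Freeze the encoder so only the decoder (including its cross-attention
# and LM head) receives gradient updates during training.
for param in model.encoder.parameters():
    param.requires_grad = False
```

With this split, `model.parameters()` still iterates over everything, so the optimizer should be built from the trainable parameters only, e.g. `filter(lambda p: p.requires_grad, model.parameters())`.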