BART with custom encoder and decoder

Hi, I am currently working on German abstractive summarisation. My goal is a model that has learned one specific style of summarisation.

I was therefore wondering whether it is possible to train a German or multilingual GPT-2 model for language modeling and insert it into a BART model as the decoder, with a fine-tuned BERT model as the encoder.

BART is usually described as combining a “GPT-like decoder” with a “BERT-like encoder”, which made me wonder whether these components are exchangeable. That would allow me to use a GPT-2 decoder trained on the specific task and in the desired language, together with a German BERT model as the encoder.
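
To make the idea concrete, here is a minimal sketch of one way such a pairing can be expressed. It assumes the Hugging Face transformers library and its generic EncoderDecoderModel wrapper rather than the BART classes themselves, and it uses tiny, randomly initialised configs (all sizes are made up) so that it runs without downloading anything. With real checkpoints, EncoderDecoderModel.from_encoder_decoder_pretrained would be the corresponding call, and the decoder's newly added cross-attention weights would start untrained, so the combined model would still need fine-tuning on summarisation data.

```python
import torch
from transformers import (
    BertConfig,
    EncoderDecoderConfig,
    EncoderDecoderModel,
    GPT2Config,
)

# Tiny illustrative configs (assumed sizes, not real checkpoints).
encoder_config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
decoder_config = GPT2Config(
    vocab_size=120,
    n_embd=32,
    n_layer=2,
    n_head=2,
    n_positions=64,
)

# This helper marks the decoder config as a decoder with cross-attention.
config = EncoderDecoderConfig.from_encoder_decoder_configs(
    encoder_config, decoder_config
)
model = EncoderDecoderModel(config=config)  # random weights
model.eval()

# Dummy batch: 2 source sequences of length 7, 2 target prefixes of length 5.
input_ids = torch.randint(1, 100, (2, 7))
decoder_input_ids = torch.randint(1, 120, (2, 5))

with torch.no_grad():
    logits = model(
        input_ids=input_ids, decoder_input_ids=decoder_input_ids
    ).logits

print(logits.shape)  # (batch, target length, decoder vocabulary size)
```

The encoder and decoder keep separate vocabularies here, which is why the output dimension follows the GPT-2 config and not the BERT one.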

Does anyone have experience with this? Even if only one part (decoder or encoder) is exchangeable, that would be very interesting.
I am grateful for any leads.


By simply assigning model.decoder = gpt_model, I would add the decoder to the model rather than exchange the existing one. Is that correct?
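
To check the assignment behaviour in isolation, here is a small sketch assuming the model is a PyTorch nn.Module. TinySeq2Seq and Wrapper are hypothetical stand-ins, not BART or GPT-2. It suggests that assigning to an attribute name that already holds a submodule replaces that submodule, while assigning to a name that does not exist on that object (for example because the real decoder is nested one level deeper) only adds an extra child that forward never uses.

```python
import torch
import torch.nn as nn


class TinySeq2Seq(nn.Module):
    """Hypothetical stand-in for a model with encoder and decoder attributes."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 8)
        self.decoder = nn.Linear(8, 8)

    def forward(self, x):
        return self.decoder(self.encoder(x))


class Wrapper(nn.Module):
    """Hypothetical stand-in for a head class that nests the seq2seq model."""

    def __init__(self):
        super().__init__()
        self.model = TinySeq2Seq()

    def forward(self, x):
        return self.model(x)


# Case 1: the attribute already exists, so the assignment replaces it.
model = TinySeq2Seq()
old_decoder = model.decoder
new_decoder = nn.Linear(8, 4)  # stand-in for gpt_model
model.decoder = new_decoder

child_names = [name for name, _ in model.named_children()]
old_still_registered = any(m is old_decoder for m in model.modules())
out_shape = tuple(model(torch.zeros(2, 8)).shape)

# Case 2: the attribute does not exist at this level, so a new child is added
# and the nested decoder that forward actually calls is left untouched.
wrapper = Wrapper()
nested_decoder = wrapper.model.decoder
wrapper.decoder = nn.Linear(8, 4)

wrapper_child_names = [name for name, _ in wrapper.named_children()]
nested_unchanged = wrapper.model.decoder is nested_decoder
wrapper_out_shape = tuple(wrapper(torch.zeros(2, 8)).shape)

print(child_names, old_still_registered, out_shape)
print(wrapper_child_names, nested_unchanged, wrapper_out_shape)
```

Even where the assignment does replace the decoder, the surrounding forward pass still calls it with its own argument names and expects its own output format, so a GPT-2 module dropped into a BART model would presumably not run without further adaptation.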