Hello,
I have just stumbled upon the “Warm-starting BERT2BERT for CNN/Dailymail” notebook that @patrickvonplaten kindly shared with us: https://github.com/patrickvonplaten/notebooks/blob/master/BERT2BERT_for_CNN_Dailymail.ipynb
From my understanding, one can use this notebook to fine-tune any ForConditionalGeneration model, such as T5, Pegasus, ProphetNet, etc. Can you please confirm this?
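For instance, I assume the notebook's warm-start cell could simply be swapped for a direct checkpoint load along these lines (just a sketch of what I have in mind, using the standard T5 and Pegasus hub checkpoints, with the rest of the data preparation and Trainer setup kept as in the notebook):

from transformers import AutoTokenizer, T5ForConditionalGeneration

# load a seq2seq model directly instead of warm-starting an EncoderDecoderModel
tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# and presumably Pegasus would follow the same pattern:
# from transformers import PegasusForConditionalGeneration
# model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-large")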
Moreover, I am interested in the new Longformer and Reformer models, which can be fed much longer sequences. These two models do not have a ForConditionalGeneration class. However, I was wondering if they could be fine-tuned on summarization tasks using the same script, e.g. by replacing
from transformers import EncoderDecoderModel
bert2bert = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "bert-base-uncased")
with
from transformers import LongformerModel, ReformerModel
longformer = LongformerModel.from_pretrained("allenai/longformer-base-4096")
reformer = ReformerModel.from_pretrained("google/reformer-enwik8")
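Or, if the bare models cannot be plugged in directly, would they instead need to be wrapped as encoder/decoder pairs the same way BERT is in the notebook? Something like the line below is what I am guessing, though I have not verified that these checkpoints can serve as a decoder here:

longformer2longformer = EncoderDecoderModel.from_encoder_decoder_pretrained("allenai/longformer-base-4096", "allenai/longformer-base-4096")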
Thank you for your help!