Continued (in-domain) Pre-training of BART

Is there any guidance on this? I tried pre-training in Fairseq, but I had issues later porting the BART weights into the Longformer on HuggingFace, so I'd love to do it all in HuggingFace.

If no scripts are available, should I just use `Seq2SeqTrainer` and implement the denoising myself inside a custom dataset class?
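For the denoising part, I mean something along these lines — a rough sketch of BART-style text infilling on token ids, to be called from the dataset's `__getitem__`. The `MASK_ID` constant, `mask_ratio=0.3`, and Poisson(3) span lengths are my assumptions based on the BART paper's description; in practice the mask id would come from `tokenizer.mask_token_id`, and Fairseq's `DenoisingDataset` also does things like sentence permutation, so this is not a complete reimplementation:

```python
import math
import random

MASK_ID = 50264  # assumption: Hugging Face BART's <mask> token id


def _poisson(lam, rng):
    """Knuth's inverse-transform Poisson sampler (no numpy needed)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while p > threshold:
        k += 1
        p *= rng.random()
    return k - 1


def text_infill(token_ids, mask_ratio=0.3, lam=3.0, seed=None):
    """Replace contiguous spans (span length ~ Poisson(lam)), covering
    roughly mask_ratio of the tokens, with a single MASK_ID each.

    Returns the corrupted sequence; the clean token_ids are the labels."""
    rng = random.Random(seed)
    ids = list(token_ids)
    budget = int(round(len(ids) * mask_ratio))  # max tokens to cover
    out, i = [], 0
    while i < len(ids):
        if budget > 0 and rng.random() < mask_ratio:
            span = max(1, _poisson(lam, rng))
            span = min(span, budget, len(ids) - i)  # don't overshoot
            out.append(MASK_ID)  # whole span collapses to one mask
            budget -= span
            i += span
        else:
            out.append(ids[i])
            i += 1
    return out


# usage: corrupted input for the encoder, original ids as labels
clean = list(range(100))
noisy = text_infill(clean, seed=0)
```

The custom dataset would then hand `Seq2SeqTrainer` examples with `input_ids=noisy` and `labels=clean`.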

Thanks :slight_smile:

Did you happen to figure out how to do continued pre-training on BART?