Pre-Training MBART/MBART50 from Scratch in HuggingFace

I have my own custom dataset on which I need to pre-train an MBART/MBART50-type architecture from scratch. I have already tried the tutorial (given here), but it is difficult to extend the same concept to MBART/MBART50. The doubts that I have:
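For context, what I am trying to do is roughly the following: initialise the model from a fresh config rather than a checkpoint. This is only a sketch under my assumptions (the config sizes are placeholders, much smaller than the released checkpoints), not a working setup:

```python
# Sketch: initialising an MBART model from scratch (random weights),
# instead of loading a pre-trained checkpoint with from_pretrained().
from transformers import MBartConfig, MBartForConditionalGeneration

config = MBartConfig(
    vocab_size=32000,            # must match the custom tokenizer (assumed size)
    d_model=512,                 # toy dimensions for illustration only
    encoder_layers=6,
    decoder_layers=6,
    encoder_attention_heads=8,
    decoder_attention_heads=8,
)
model = MBartForConditionalGeneration(config)  # randomly initialised, no checkpoint
print(model.num_parameters())
```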

  1. I need a custom-trained tokenizer. The tokenizer used in the MBART/MBART50 architecture is a multilingual BPE tokenizer, but following the guide here for training a BPE tokenizer does not work for me (while training, the following error is produced: `MBart.from_pretrained() got an additional argument 'labels'`).
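What I tried for the tokenizer part looks roughly like this sketch with the `tokenizers` library (the tiny inline corpus and output directory are placeholders so the snippet is self-contained; in reality I pass my own multilingual files):

```python
from pathlib import Path
from tokenizers import ByteLevelBPETokenizer

# Tiny throwaway corpus so the sketch runs end to end; replace with real files.
Path("toy_corpus.txt").write_text("hello world\nbonjour le monde\n", encoding="utf-8")

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["toy_corpus.txt"],
    vocab_size=1000,
    min_frequency=1,
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)

out_dir = Path("toy_tokenizer")
out_dir.mkdir(exist_ok=True)
tokenizer.save_model(str(out_dir))  # writes vocab.json and merges.txt

enc = tokenizer.encode("hello world")
print(enc.tokens)
```

(My guess, which I would like confirmed: the `labels` error suggests the bare `MBartModel` was used somewhere instead of `MBartForConditionalGeneration`, since only the latter's `forward` accepts a `labels` argument.)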

  2. How do we train with a multilingual dataset, or pass one to `LineByLineTextDataset`? I couldn't find any reference on how to do it.
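The closest thing I can think of is concatenating per-language files into one line-by-line file, tagging each line with an MBART-style language code (`en_XX`, `fr_XX`, …). This is only a sketch of that idea with toy inline files, and I am not sure it matches what MBART pre-training actually expects:

```python
from pathlib import Path

# Toy per-language files; replace with real corpora (one sentence per line).
Path("train.en.txt").write_text("hello world\n", encoding="utf-8")
Path("train.fr.txt").write_text("bonjour le monde\n", encoding="utf-8")

corpora = {"en_XX": "train.en.txt", "fr_XX": "train.fr.txt"}

with open("train.all.txt", "w", encoding="utf-8") as out:
    for lang_code, path in corpora.items():
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            if line.strip():
                # MBART marks each sequence with its language code token.
                out.write(f"{line} {lang_code}\n")

merged = Path("train.all.txt").read_text(encoding="utf-8")
print(merged)
```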

  3. Since my further task is to use adapters with the model, training with Hugging Face is kind of a given. I can't use other libraries that lack built-in (or have only partial) support for adapters.

Any help/reference would be really helpful. I can clarify more if this conversation is continued!