Additional pre-training objective function

Hi everyone,

I would like to take an already pre-trained BERT model and run additional pre-training steps on a domain-specific dataset (an English learners' dataset).

During this additional pre-training, I would like to use the Masked Language Modeling objective together with a custom objective (classification of CEFR levels).

I am looking for help on how to add this second objective (the total loss would be the sum of the two losses), along with the architecture changes it requires (in particular, extra output nodes for the classification task).
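To make the idea concrete, here is a rough sketch of what I have in mind. It assumes PyTorch and the Hugging Face transformers library, neither of which is a hard requirement on my side. The wrapper class name, the choice of 6 classes (A1 to C2), the use of the [CLS] position as the sentence representation, and the mask token id are all my own assumptions for illustration. The demo builds a tiny randomly initialised BERT so it runs without downloading anything; in practice the model would presumably be loaded with `BertForMaskedLM.from_pretrained(...)`.

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertForMaskedLM


class BertMlmWithCefrHead(nn.Module):
    """Hypothetical wrapper: MLM model plus an extra CEFR classification head."""

    def __init__(self, mlm_model, num_cefr_levels=6):
        super().__init__()
        self.mlm_model = mlm_model
        hidden_size = mlm_model.config.hidden_size
        # Extra output nodes for the classification task (assumed: A1..C2).
        self.cefr_classifier = nn.Linear(hidden_size, num_cefr_levels)
        self.cls_loss_fn = nn.CrossEntropyLoss()

    def forward(self, input_ids, attention_mask, mlm_labels, cefr_labels):
        outputs = self.mlm_model(
            input_ids=input_ids,
            attention_mask=attention_mask,
            labels=mlm_labels,          # positions set to -100 are ignored
            output_hidden_states=True,  # needed to reach the encoder output
        )
        mlm_loss = outputs.loss
        # Final-layer hidden state at the first ([CLS]) position.
        cls_vector = outputs.hidden_states[-1][:, 0]
        cefr_logits = self.cefr_classifier(cls_vector)
        cls_loss = self.cls_loss_fn(cefr_logits, cefr_labels)
        # Total loss is the plain sum of the two objectives.
        total_loss = mlm_loss + cls_loss
        return total_loss, mlm_loss, cls_loss, cefr_logits


torch.manual_seed(0)

# Tiny random configuration, only so that the sketch runs offline.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=32,
)
model = BertMlmWithCefrHead(BertForMaskedLM(config))

# Toy batch: 4 sequences of 16 tokens, one masked position per sequence.
input_ids = torch.randint(5, 100, (4, 16))
attention_mask = torch.ones_like(input_ids)
mlm_labels = torch.full_like(input_ids, -100)
mlm_labels[:, 3] = input_ids[:, 3]   # remember the original token
input_ids[:, 3] = 4                  # assumed [MASK] id for this toy vocab
cefr_labels = torch.tensor([0, 2, 5, 1])

total_loss, mlm_loss, cls_loss, cefr_logits = model(
    input_ids, attention_mask, mlm_labels, cefr_labels
)
total_loss.backward()  # gradients reach both the encoder and the new head
```

With real data, the masking would be handled by a proper collator rather than by hand, and the two losses could be weighted instead of simply summed if one of them dominates.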

Thanks in advance,

Yann