Hi everyone,
I would like to take an already pre-trained version of the BERT model and run additional pre-training steps on a domain-specific dataset (an English learners dataset).
During this pre-training, I would like to use the Masked Language Modeling (MLM) objective, as well as another custom objective (classification on CEFR levels).
I am looking for help on how to add this second objective (the total loss would be the sum of the two losses), along with the associated architecture changes (typically, I need some extra output nodes for the classification task).
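To make it concrete, here is a rough, untested sketch of what I have in mind, using the Hugging Face transformers library. The class name `BertForMLMAndCEFR`, the `bert-base-uncased` checkpoint, and the default of 6 CEFR levels (A1–C2) are just my assumptions for illustration:

```python
import torch.nn as nn
from transformers import BertForMaskedLM


class BertForMLMAndCEFR(nn.Module):
    """Pre-trained BERT encoder + its MLM head, plus an extra CEFR classification head."""

    def __init__(self, model_name="bert-base-uncased", num_cefr_levels=6):
        super().__init__()
        # Re-use the pre-trained encoder *and* MLM head so further MLM pre-training
        # starts from the published weights.
        self.mlm_model = BertForMaskedLM.from_pretrained(model_name)
        hidden_size = self.mlm_model.config.hidden_size
        # Extra output nodes for the CEFR task (A1..C2 -> 6 classes by default).
        self.dropout = nn.Dropout(self.mlm_model.config.hidden_dropout_prob)
        self.cefr_classifier = nn.Linear(hidden_size, num_cefr_levels)

    def forward(self, input_ids, attention_mask=None, mlm_labels=None, cefr_labels=None):
        # Shared encoder forward pass.
        encoder_out = self.mlm_model.bert(input_ids=input_ids, attention_mask=attention_mask)
        sequence_output = encoder_out.last_hidden_state              # (batch, seq_len, hidden)

        # MLM logits from the original pre-trained head.
        mlm_logits = self.mlm_model.cls(sequence_output)              # (batch, seq_len, vocab)
        # CEFR logits from the [CLS] token representation.
        cefr_logits = self.cefr_classifier(self.dropout(sequence_output[:, 0]))

        loss = None
        if mlm_labels is not None and cefr_labels is not None:
            loss_fct = nn.CrossEntropyLoss()                          # positions labelled -100 are ignored
            mlm_loss = loss_fct(
                mlm_logits.view(-1, self.mlm_model.config.vocab_size), mlm_labels.view(-1)
            )
            cefr_loss = loss_fct(cefr_logits, cefr_labels)
            loss = mlm_loss + cefr_loss                                # total loss = sum of the two losses
        return {"loss": loss, "mlm_logits": mlm_logits, "cefr_logits": cefr_logits}
```

The idea would be to feed it batches that carry both the MLM labels (e.g. produced by `DataCollatorForLanguageModeling`) and a per-sequence CEFR label, then backprop the summed loss in a standard training loop. Does this look like a reasonable approach, or is there a better/standard way to do this?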
Thanks in advance,
Yann