Having fine-tuning as well as pre-training together as multi-task

Hi

I am interested in having an MLM task as well as a classification task in a single setup.

Any leads?

I think it would be simpler to do MLM first and then classification. Is there any reason why you need to define the model with both heads at once?

It is certainly possible (in native PyTorch or native TensorFlow) to define two different pathways through a model, for example one shared encoder with an MLM head and a classification head on top.
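Something along these lines is one way to sketch it in PyTorch with a shared Hugging Face encoder; the model name, head sizes, and the `task` flag are just placeholders for illustration, not a definitive recipe:

```python
import torch.nn as nn
from transformers import AutoModel

class MultiTaskModel(nn.Module):
    """Shared encoder with two pathways: an MLM head and a classification head."""

    def __init__(self, model_name="bert-base-uncased", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        vocab = self.encoder.config.vocab_size
        self.mlm_head = nn.Linear(hidden, vocab)       # token-level predictions
        self.cls_head = nn.Linear(hidden, num_labels)  # sequence-level predictions

    def forward(self, input_ids, attention_mask, task="cls"):
        hidden_states = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        if task == "mlm":
            # (batch, seq_len, vocab) logits for masked-token prediction
            return self.mlm_head(hidden_states)
        # Use the first ([CLS]) token representation for classification
        return self.cls_head(hidden_states[:, 0])
```

During training you could then alternate batches between the two tasks (or compute both losses on the same batch and sum them, possibly with a weighting factor) and backpropagate through the shared encoder.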

When you say “pre-training”, do you mean that you want to train a model from scratch, or are you going to start with a pre-trained model and then do multiple further training steps?

Sorry for using the ambiguous term. I meant using the existing weights from an already-trained model.