First of all, I’m new to Hugging Face, so I hope my question doesn’t sound foolish.
I’m doing research on the effects of pre-training tasks by swapping out BERT’s. To do that, I need to pre-train BERT from scratch with one or more pre-training tasks other than NSP and MLM. So basically, I will create a Transformer model with the same architectural properties as BERT and train it on different tasks.
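To illustrate what I mean (a minimal sketch, assuming the `transformers` library): instantiating a randomly initialised encoder from a `BertConfig` gives you BERT’s architecture with no pre-trained weights, so in principle it can be trained on any objective. The config values below are just the bert-base-uncased defaults.

```python
from transformers import BertConfig, BertModel

# Same architectural properties as bert-base-uncased,
# but randomly initialised (no from_pretrained call).
config = BertConfig(
    vocab_size=30522,
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,
)
model = BertModel(config)  # weights are random; ready for pre-training from scratch
```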
As far as I know, the examples are all about pre-training BERT with MLM (with or without NSP).
Is it possible to use the BERT model in Hugging Face with different pre-training tasks such as PLM, DAE, etc.? I will also evaluate the resulting models with BertForNextSentencePrediction, BertForMaskedLM, BertForSequenceClassification, and BertForTokenClassification. Even if I use a different model class for pre-training, can I still load the resulting weights into the BertFor… classes?
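Here is what I imagine for the reuse part (a sketch, assuming my custom pre-training model keeps a standard `BertModel` as its encoder — `"./my_bert"` is just a placeholder path): saving the shared encoder with `save_pretrained` and reloading it under one of the BertFor… heads, which would then be fine-tuned as usual.

```python
from transformers import BertConfig, BertModel, BertForSequenceClassification

# Tiny config so the example runs quickly; in practice this would be
# the encoder trained on my custom objective.
config = BertConfig(vocab_size=1000, hidden_size=64, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=128)
encoder = BertModel(config)            # stand-in for the custom-pretrained encoder
encoder.save_pretrained("./my_bert")   # writes config.json + weights

# Reload the same weights under a downstream head; the head itself
# is freshly initialised and would be fine-tuned.
clf = BertForSequenceClassification.from_pretrained("./my_bert", num_labels=3)
```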
How can I train BERT with multiple tasks different from MLM + NSP, such as PLM + SOP?
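For the multi-task case, this is the kind of setup I have in mind (a sketch with my own head names — `lm_head` and `sop_head` are not a transformers API): one shared `BertModel` encoder with two task-specific heads, summing the per-task losses each batch. The token-level head stands in for an objective like PLM, and the binary head over the pooled output for SOP.

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertModel

class MultiTaskBert(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.bert = BertModel(config)                                  # shared encoder
        self.lm_head = nn.Linear(config.hidden_size, config.vocab_size)  # token-level objective (e.g. PLM targets)
        self.sop_head = nn.Linear(config.hidden_size, 2)                 # sentence-order prediction

    def forward(self, input_ids, attention_mask, lm_labels, sop_labels):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        lm_logits = self.lm_head(out.last_hidden_state)   # (batch, seq, vocab)
        sop_logits = self.sop_head(out.pooler_output)     # (batch, 2)
        loss_fct = nn.CrossEntropyLoss(ignore_index=-100)
        lm_loss = loss_fct(lm_logits.view(-1, lm_logits.size(-1)), lm_labels.view(-1))
        sop_loss = loss_fct(sop_logits, sop_labels)
        return lm_loss + sop_loss                         # joint objective

# Tiny config and random data, just to show the forward pass works.
config = BertConfig(vocab_size=1000, hidden_size=64, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=128)
model = MultiTaskBert(config)
ids = torch.randint(0, 1000, (2, 8))
mask = torch.ones_like(ids)
lm_labels = torch.randint(0, 1000, (2, 8))
sop_labels = torch.tensor([0, 1])
loss = model(ids, mask, lm_labels, sop_labels)  # scalar, backprop-ready
```

Is something along these lines the recommended way, or is there built-in support for custom or combined pre-training objectives that I’ve missed?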