Which objective should I use to pre-train a BERT model?

I am going to pre-train a BERT model on a specific dataset, aiming for sentiment analysis.
To self-train the model, which method would be better: Masked Language Modeling or Next Sentence Prediction? Or maybe there is no definitive answer.


The choice depends on what you want to do.

  • Masked language modeling is a good choice when you want strong representations of the data the model is trained on, which is what a downstream task like sentiment analysis needs (see the sketch after this list).
  • Next sentence prediction, or rather causal language modeling (as in GPT), is a better fit when you want to focus on generation.
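
If you go the MLM route, continued pre-training on your own corpus is straightforward with 🤗 Transformers. Here is a minimal sketch; the base checkpoint, the file name `domain_corpus.txt`, the output directory, and all hyperparameters are placeholder assumptions, not anything from this thread:

```python
# Minimal sketch: continued pre-training of BERT with masked language modeling.
# Assumes a hypothetical plain-text file "domain_corpus.txt", one example per line.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# The collator randomly masks tokens and builds the MLM labels on the fly.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

training_args = TrainingArguments(
    output_dir="bert-domain-mlm",
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()

# Save the adapted checkpoint (and tokenizer) for later fine-tuning.
trainer.save_model("bert-domain-mlm")
tokenizer.save_pretrained("bert-domain-mlm")
```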

The course has a section on how to fine-tune a masked language model that could be interesting to you: Main NLP tasks - Hugging Face Course.
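
Once you have the domain-adapted checkpoint, sentiment analysis itself is a standard sequence classification fine-tune on top of it. A hypothetical follow-on, assuming the `bert-domain-mlm` directory saved above and a binary label set:

```python
# Load the domain-adapted checkpoint with a fresh classification head.
# "bert-domain-mlm" and num_labels=2 are placeholder assumptions.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-domain-mlm")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-domain-mlm", num_labels=2  # e.g. positive / negative
)
# From here, train with the Trainer on your labeled sentiment data,
# exactly as in a regular text-classification fine-tuning run.
```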
