How to do unsupervised fine-tuning?

I have a custom text dataset that I want BERT to get acquainted with. My final goal is not to run any supervised task (it is actually to use the model as a starting point for getting sentence embeddings with S-BERT).

I just want to continue the unsupervised training on my dataset. How do I do this?

So far, I have come across two possible candidates in the documentation for this:

  1. BertForPreTraining (the self-explanatory name led me to this)
  2. BertForMaskedLM (as used in this blog post).

Can both of them be used for this purpose? Is one better suited to it? Have you tried to do something like this before? Any additional suggestions would also be very helpful.

Thank you :slight_smile:

BertForPreTraining has two heads: one for masked language modeling (MLM) and one for next sentence prediction (NSP). Use this class when you want to pre-train BERT exactly as described in the paper, i.e. with both the MLM and NSP objectives.
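
For reference, here is a minimal sketch of the two heads, assuming the transformers library and the public bert-base-uncased checkpoint (the sentences are just placeholders):

```python
# Inspect the two pre-training heads of BertForPreTraining.
from transformers import BertForPreTraining, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForPreTraining.from_pretrained("bert-base-uncased")

# Encode a sentence pair so the NSP head has something meaningful to score.
inputs = tokenizer("Sentence A.", "Sentence B.", return_tensors="pt")
outputs = model(**inputs)

# MLM head: one score per vocabulary token at every position.
print(outputs.prediction_logits.shape)        # (1, seq_len, vocab_size)
# NSP head: 2-way "is B the next sentence after A" classification.
print(outputs.seq_relationship_logits.shape)  # (1, 2)
```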

BertForMaskedLM trains with the MLM objective only, which can also be used for (continued) pre-training on your own corpus.
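
If you go the MLM-only route, a rough sketch of continued pre-training with the Trainer could look like the following. The file path my_corpus.txt and all hyperparameters are placeholders, and LineByLineTextDataset is deprecated in recent transformers releases in favour of the datasets library, but the overall recipe stays the same:

```python
# Continued MLM pre-training (domain adaptation) of BERT on a custom text file.
from transformers import (
    BertForMaskedLM,
    BertTokenizerFast,
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Tokenize the corpus line by line; "my_corpus.txt" is a placeholder path
# to a plain-text file with one document/sentence per line.
dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="my_corpus.txt",
    block_size=128,
)

# Dynamically mask 15% of tokens in each batch, as in the original BERT recipe.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

training_args = TrainingArguments(
    output_dir="bert-domain-adapted",
    num_train_epochs=1,
    per_device_train_batch_size=16,
    save_steps=10_000,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=dataset,
)
trainer.train()
trainer.save_model("bert-domain-adapted")
```

The saved checkpoint can then be loaded as the base transformer when you build your S-BERT model in sentence-transformers.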