Question on Next Sentence Prediction

Hi !
I’m trying to fine-tune a transformer model on a simultaneous MLM + NSP task. Some older code examples show that there used to be a DataCollatorForNextSentencePrediction class that could be relied on in this configuration. However, this class no longer exists and has been replaced by DataCollatorForLanguageModeling, as stated in this issue: Why was DataCollatorForNextSentencePrediction removed ? · Issue #9416 · huggingface/transformers · GitHub

I’m nevertheless a bit confused, because the source code for the DataCollatorForLanguageModeling class shows no parameter for controlling the amount of NSP, while there is a float parameter (mlm_probability) for how many words should be masked during training.
I was wondering whether someone could give me a clearer picture of this class and how to involve Next Sentence Prediction as an auxiliary task during MLM training.

Thanks a lot !

If you use TextDatasetForNextSentencePrediction for your dataset, there is a parameter called nsp_probability with a default value of 0.5, just like in BERT. In other words, the NSP part is handled by the dataset (which builds the sentence pairs and labels), while the collator only takes care of the masking.

So I believe it fits your needs :slightly_smiling_face: