RoBERTa trained with NSP

I want to run experiments with a RoBERTa model that has been pre-trained on the combined MLM+NSP objective. In the paper, NSP was dropped because it led to lower downstream performance, so the authors never released such a checkpoint. Does anyone know whether one is available in some form, or of an implementation that can replicate it (including the pre-training)? I know transformers provides the building blocks, but my GPU access time is restricted, so there is little room for error. If neither model weights nor an implementation is available, I'd really appreciate a working pre-training routine built on transformers.
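For concreteness, here is the kind of routine I have in mind — a minimal sketch that bolts a BERT-style NSP head onto `RobertaModel`. The class name `RobertaForMLMAndNSP` and the NSP head are my own additions (transformers ships no NSP variant for RoBERTa), so please correct me if there is a better way:

```python
import torch
import torch.nn as nn
from transformers import RobertaConfig, RobertaModel
from transformers.models.roberta.modeling_roberta import RobertaLMHead


class RobertaForMLMAndNSP(nn.Module):
    """RoBERTa body with both an MLM head and a BERT-style NSP head.

    Hypothetical sketch: the 2-way NSP classifier on the pooled <s> token
    is my own addition, not part of transformers' RoBERTa implementation.
    """

    def __init__(self, config: RobertaConfig):
        super().__init__()
        # add_pooling_layer=True gives us pooler_output for the NSP head
        self.roberta = RobertaModel(config, add_pooling_layer=True)
        self.lm_head = RobertaLMHead(config)          # standard MLM head
        self.nsp_head = nn.Linear(config.hidden_size, 2)  # is-next / not-next

    def forward(self, input_ids, attention_mask=None,
                mlm_labels=None, nsp_labels=None):
        outputs = self.roberta(input_ids, attention_mask=attention_mask)
        mlm_scores = self.lm_head(outputs.last_hidden_state)
        nsp_scores = self.nsp_head(outputs.pooler_output)

        loss = None
        if mlm_labels is not None and nsp_labels is not None:
            loss_fct = nn.CrossEntropyLoss()  # ignore_index=-100 for MLM pads
            mlm_loss = loss_fct(
                mlm_scores.view(-1, self.roberta.config.vocab_size),
                mlm_labels.view(-1),
            )
            nsp_loss = loss_fct(nsp_scores.view(-1, 2), nsp_labels.view(-1))
            loss = mlm_loss + nsp_loss  # equal weighting, as in BERT
        return loss, mlm_scores, nsp_scores
```

To start from released weights rather than scratch, I assume the body could be swapped in with `RobertaModel.from_pretrained("roberta-base")`, leaving only the NSP head randomly initialized; the sentence-pair batches and masking would then come from a standard data collator. Does this look like a reasonable starting point?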