Pre-training BERT

rishabhstha · February 10, 2023, 6:39am

I pre-trained BERT from scratch on a custom domain-specific dataset, using BertForPreTraining with both the MLM and NSP objectives. I also trained my own custom tokenizer. I stopped after 8 epochs because the loss had plateaued.

My loss was around 2.3. Is that normal for pre-training BERT?
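
For context on a number like 2.3: the loss returned by BertForPreTraining is the sum of the MLM cross-entropy and the NSP cross-entropy, both in nats. A rough sanity check is to compare it with the loss of a model that guesses uniformly at random. The sketch below does only that arithmetic. The vocabulary size of 30522 is an assumption (it is the bert-base-uncased size; a custom tokenizer will differ), and the perplexity line assumes the whole loss came from MLM, so it is an upper bound rather than a measurement.

```python
import math


def uniform_cross_entropy(num_classes: int) -> float:
    """Cross-entropy in nats of a model that guesses uniformly over num_classes."""
    return math.log(num_classes)


# Assumption: 30522 is the bert-base-uncased vocabulary size.
# Replace it with the size of your own tokenizer's vocabulary.
ASSUMED_VOCAB_SIZE = 30522

# Loss of an untrained model that guesses uniformly for each objective.
mlm_baseline = uniform_cross_entropy(ASSUMED_VOCAB_SIZE)
nsp_baseline = uniform_cross_entropy(2)  # NSP is a two-class problem
combined_baseline = mlm_baseline + nsp_baseline

# The value reported in the post (MLM loss + NSP loss).
reported_loss = 2.3

# If the entire reported loss were MLM loss, this would be the MLM perplexity.
# The NSP share is unknown, so treat this as an upper bound.
perplexity_upper_bound = math.exp(reported_loss)

print(f"uniform-guess MLM loss:      {mlm_baseline:.3f}")
print(f"uniform-guess NSP loss:      {nsp_baseline:.3f}")
print(f"uniform-guess combined loss: {combined_baseline:.3f}")
print(f"perplexity upper bound:      {perplexity_upper_bound:.2f}")
```

A combined loss well below the uniform-guess baseline suggests the model has learned something, but it does not say whether 2.3 is good for a given corpus. Logging the MLM and NSP terms separately would make the number easier to interpret.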

Another question: I also pre-trained a second model, starting from the bert-base-uncased checkpoint but using my custom tokenizer. Is that fine, given that the vocabulary is different? The loss is decreasing and behaves almost the same as for the previous model. Is my model really training?
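
On the vocabulary question, a toy illustration may help show what can go wrong. The embedding matrix of a pre-trained checkpoint is indexed by token ID, so a tokenizer with a different vocabulary can map a token to a row that was trained for some other token. Both vocabularies below are made up for the example (they are not the real bert-base-uncased vocabulary or the poster's), and `misaligned_tokens` is a hypothetical helper, not a library function.

```python
# Hypothetical toy vocabularies, for illustration only.
pretrained_vocab = {"[PAD]": 0, "[UNK]": 1, "the": 2, "cell": 3, "protein": 4}
custom_vocab = {"[PAD]": 0, "[UNK]": 1, "protein": 2, "kinase": 3, "the": 4}


def misaligned_tokens(pretrained: dict, custom: dict) -> dict:
    """Map each custom token to the token its embedding row was trained for,
    keeping only the cases where the two differ.

    A value of None means the ID has no row in the pre-trained vocabulary.
    """
    id_to_pretrained_token = {idx: tok for tok, idx in pretrained.items()}
    mismatches = {}
    for token, idx in custom.items():
        trained_for = id_to_pretrained_token.get(idx)
        if trained_for != token:
            mismatches[token] = trained_for
    return mismatches


mismatches = misaligned_tokens(pretrained_vocab, custom_vocab)
for token, trained_for in mismatches.items():
    print(f"{token!r} would reuse the embedding row trained for {trained_for!r}")
```

In a case like this the model can still train and the loss can still fall, because the misassigned rows are simply relearned. The benefit of starting from the checkpoint is then likely limited to the non-embedding weights. If the custom vocabulary is also larger than the checkpoint's, the embedding matrix has to be resized first (in 🤗 Transformers, `model.resize_token_embeddings(len(tokenizer))`), and resizing alone does not realign the existing rows.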

santosale · May 21, 2024, 5:17pm

Sorry, could you share the code?
