DistilBERT pre-training for a new text corpus

krayush07 · April 29, 2021, 3:35pm

Suppose I want to use a DistilBERT model for a new text corpus (say, a social media corpus, SMC) that differs from the data BERT was originally trained on. I see two ways to train DistilBERT:

1. Further pre-train BERT from its checkpoint on the SMC data to get SMC-BERT. Then train DistilBERT with SMC-BERT as the teacher model, using the SMC corpus as the train/valid/test data.
2. Train DistilBERT directly with the original BERT as the teacher model, using the SMC corpus as the train/valid/test data.

Is one of these two approaches recommended for using DistilBERT on a new corpus whose textual style differs from that of the original BERT corpus?
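
To make the comparison concrete, here is a minimal sketch of the soft-target distillation term, assuming the usual temperature-softened KL divergence between teacher and student predictions. The logits, the 4-token vocabulary, and the helper names `softmax` and `distillation_loss` are hypothetical and only for illustration. As far as I understand, the full DistilBERT objective also combines this term with a masked language modeling loss and a cosine embedding loss, which are left out here. In this sketch the two options differ only in which teacher supplies the logits.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        p * (math.log(p) - math.log(q))
        for p, q in zip(p_teacher, p_student)
    )
    return kl * temperature ** 2


# Hypothetical logits for one masked position over a 4-token vocabulary.
smc_bert_teacher_logits = [0.5, 3.0, 0.2, -1.0]  # option 1: SMC-BERT teacher
bert_teacher_logits = [2.0, 1.0, 0.5, -1.0]      # option 2: original BERT teacher
student_logits = [1.0, 1.5, 0.3, -0.5]           # DistilBERT student

loss_option_1 = distillation_loss(student_logits, smc_bert_teacher_logits)
loss_option_2 = distillation_loss(student_logits, bert_teacher_logits)
print("option 1 loss:", loss_option_1)
print("option 2 loss:", loss_option_2)
```

In both options the student only ever sees SMC text. The open question is whether the teacher's predictions on that text are better targets after the teacher itself has been adapted to SMC.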
