Fine-tuning BERT Model on domain specific language

tillfurger · January 4, 2021, 7:35pm

Hi everyone

I want to further fine-tune a BERT model on domain-specific language, as done in https://arxiv.org/pdf/1903.10676.pdf or https://arxiv.org/abs/1908.10063. If I understood correctly, I either have to use the same vocabulary as the original pre-trained model or train a model from scratch. Since I don’t want to train the model from scratch, I have to accept that I must keep the same vocab. My first fine-tuning step is to adapt the model to the domain-specific language: I feed it a large dataset of unlabeled domain-specific text so it gets familiar with the language, freezing some layers during training so it doesn’t forget what it learned from the pre-training corpus. Second, I want to fine-tune it further for sentiment classification, training on a smaller labeled dataset.

Can anyone help me on how to do that (both steps)? Thank you very much in advance.
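For reference, here is a minimal sketch of how the two steps could look with the `transformers` library and PyTorch. It is only an illustration under several assumptions that are not part of the question: a tiny randomly initialized BERT stands in for the real checkpoint so the snippet runs without a download, the `[MASK]` id and the toy batches are made up, the helper `freeze_lower_layers` is hypothetical, and three sentiment classes are assumed. In practice you would load the model with `BertForMaskedLM.from_pretrained(...)`, tokenize real text with the matching tokenizer, and let `DataCollatorForLanguageModeling` do the masking.

```python
import torch
from transformers import BertConfig, BertForMaskedLM, BertForSequenceClassification

torch.manual_seed(0)

# Tiny config so the sketch runs offline (assumption; a real run would use
# BertForMaskedLM.from_pretrained("bert-base-uncased") or similar).
tiny = dict(vocab_size=100, hidden_size=32, num_hidden_layers=4,
            num_attention_heads=2, intermediate_size=64,
            max_position_embeddings=64)


def freeze_lower_layers(model, num_frozen_layers):
    """Hypothetical helper: freeze embeddings and the first N encoder layers."""
    for param in model.bert.embeddings.parameters():
        param.requires_grad = False
    for layer in model.bert.encoder.layer[:num_frozen_layers]:
        for param in layer.parameters():
            param.requires_grad = False
    return model


# Step 1: domain adaptation with the masked-language-modeling objective.
mlm_model = BertForMaskedLM(BertConfig(**tiny))
freeze_lower_layers(mlm_model, num_frozen_layers=2)

# Toy batch standing in for tokenized, unlabeled domain text.
input_ids = torch.randint(5, tiny["vocab_size"], (2, 16))
mask_positions = torch.zeros_like(input_ids, dtype=torch.bool)
mask_positions[:, ::5] = True              # mask every 5th token (toy choice)
labels = torch.full_like(input_ids, -100)  # -100 positions are ignored by the loss
labels[mask_positions] = input_ids[mask_positions]
masked_inputs = input_ids.clone()
masked_inputs[mask_positions] = 4          # assumed [MASK] token id

# Only parameters that are still trainable go to the optimizer.
optimizer = torch.optim.AdamW(
    [p for p in mlm_model.parameters() if p.requires_grad], lr=5e-5)
loss = mlm_model(input_ids=masked_inputs, labels=labels).loss
loss.backward()
optimizer.step()

# Step 2: reuse the adapted encoder for sentiment classification.
# With a saved checkpoint you would typically call
# BertForSequenceClassification.from_pretrained(checkpoint_dir, num_labels=3);
# here the encoder weights are copied in memory instead.
clf_model = BertForSequenceClassification(BertConfig(**tiny, num_labels=3))
# strict=False because the MLM model has no pooler layer to copy.
clf_model.bert.load_state_dict(mlm_model.bert.state_dict(), strict=False)

# Toy labeled batch: one sentiment label per sequence.
clf_labels = torch.tensor([0, 2])
clf_loss = clf_model(input_ids=input_ids, labels=clf_labels).loss
clf_loss.backward()
```

Both models share the same vocabulary and encoder shape, which is why the weights from step 1 carry over directly; the classification head (and the pooler) start from random initialization and are learned from the labeled data. How many layers to freeze is a judgment call, and the value 2 above is arbitrary.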

tillfurger · January 5, 2021, 6:17pm

Hey, does anyone have an idea?
