Fine-tuning on the MLM task

What is the best way to fine-tune a Hugging Face model on the MLM task on my own dataset, which is fairly small (10K sentences), with regard to freezing and unfreezing layers?

Let's say the Hugging Face model has 12 encoder layers and 12 decoder layers, and I want to train it on my own dataset using the MLM technique. Would it be better to train only 1 encoder and 1 decoder, so that my dataset's information gets incorporated into the model while the model does not forget what it previously learned, or should I train all the encoder and decoder layers again?
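For concreteness, here is a sketch of what I mean by freezing layers, using the `transformers` library. (This is just an illustration of the freezing pattern, not a full training script; I use a tiny randomly initialised BERT-style config so the snippet runs stand-alone, whereas in practice you would load a pretrained checkpoint with `AutoModelForMaskedLM.from_pretrained`.)

```python
from transformers import BertConfig, BertForMaskedLM

# Tiny randomly initialised stand-in model so the sketch runs without
# downloading weights; in practice load a pretrained checkpoint instead.
config = BertConfig(hidden_size=64, num_hidden_layers=4,
                    num_attention_heads=4, intermediate_size=128)
model = BertForMaskedLM(config)

# Freeze every parameter first...
for param in model.parameters():
    param.requires_grad = False

# ...then unfreeze only the top encoder layer and the MLM head,
# so gradient updates touch a small fraction of the network.
for param in model.bert.encoder.layer[-1].parameters():
    param.requires_grad = True
for param in model.cls.parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable} / {total}")
```

Anything with `requires_grad = False` is skipped by the optimizer, so the frozen layers keep exactly their pretrained weights while the unfrozen top layer and MLM head adapt to the new data.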

It would be very helpful if you could give some intuition on this.