I have two (very basic) questions:
- I assume the tutorial fine-tunes the entire model at once. Is there an easy way to first train only the classification head and only then unfreeze the rest of the model? (See the sketch below for what I have in mind.)
- Is the classification head in `BertForSequenceClassification` pre-trained, or is it initialized randomly on top of `BertModel`? If pre-trained, which task/dataset was used for pre-training?
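For the first question, here is a minimal sketch of what I mean, assuming the usual PyTorch approach of toggling `requires_grad` on the encoder's parameters (the `num_labels=2` is just for illustration) — I'm mainly asking whether there's a cleaner built-in way than this:

```python
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Stage 1: freeze the BERT encoder so only the classification head trains.
for param in model.bert.parameters():
    param.requires_grad = False

# ... train the classification head here ...

# Stage 2: unfreeze everything and fine-tune the whole model.
for param in model.bert.parameters():
    param.requires_grad = True

# ... continue fine-tuning the full model here ...
```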
Note: I’ve been using BERT instead of DistilBERT, but I guess the same applies to both.