Tutorial: Fine-tuning with custom datasets – sentiment, NER, and question answering

For 1, you can look in the training tutorial where there is an example in PyTorch.
For 2, the head is initialized randomly since we are using a checkpoint of the base model, it would be pretrained if we used a checkpoint that has been fine-tuned for sequence classification like distilbert-base-uncased-finetuned-sst-2-english.

1 Like