Hi adrshkm,
the weights of the SequenceClassification head are initialized randomly.
See the page Fine-tune a pretrained model, which says:
When we instantiate a model with from_pretrained(), the model configuration and pre-trained weights of the specified model are used to initialize the model. The library also includes a number of task-specific final layers or ‘heads’ whose weights are instantiated randomly when not present in the specified pre-trained model. For example, instantiating a model with BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2) will create a BERT model instance with encoder weights copied from the bert-base-uncased model and a randomly initialized sequence classification head on top of the encoder with an output size of 2.
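You can check this yourself: a quick sketch that loads the same checkpoint twice and compares weights. The encoder weights come from the checkpoint both times, but the classification head is freshly initialized on each call, so the two heads should differ (the attribute names `bert.embeddings` and `classifier` are how the current transformers BERT models expose these modules):

```python
import torch
from transformers import BertForSequenceClassification

# Load the same checkpoint twice. Encoder weights are copied from the
# checkpoint; the classification head is randomly initialized each time.
m1 = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
m2 = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Encoder weights come from the checkpoint -> identical across loads.
same_encoder = torch.equal(
    m1.bert.embeddings.word_embeddings.weight,
    m2.bert.embeddings.word_embeddings.weight,
)

# Head weights are randomly initialized -> (almost surely) different.
same_head = torch.equal(m1.classifier.weight, m2.classifier.weight)

print(same_encoder, same_head)
```

This is also why the library prints a warning like "Some weights ... are newly initialized" when you load a classification model from a plain pretrained checkpoint, and why you should fine-tune before using it for predictions.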