Fine-tuning on a specific task when the pretrained model isn't trained on that task: using the task model vs. the base model

I want to fine-tune RobertaForSequenceClassification on the microsoft/codebert-base model. This microsoft/codebert-base checkpoint hasn't been trained on a sequence-classification task.
Can I load this pre-trained model into a SequenceClassification class and fine-tune it on my dataset?

model = RobertaForSequenceClassification.from_pretrained("microsoft/codebert-base")

Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at microsoft/codebert-base and are newly initialized: ['classifier.dense.weight', 'classifier.out_proj.weight', 'classifier.out_proj.bias', 'classifier.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

While loading, I get this message, which is expected since the model hasn't been trained on the task and therefore doesn't have those weights.

Can I proceed with fine-tuning this RobertaForSequenceClassification model, or would I need to define my own classifier layer on top of RobertaModel and train that?
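For what it's worth, here is a minimal sketch of what fine-tuning looks like in this setup, assuming `transformers` and `torch` are installed. To keep the example fast and self-contained it builds a tiny randomly initialized Roberta from a config; in practice you would replace that with `RobertaForSequenceClassification.from_pretrained("microsoft/codebert-base", num_labels=...)` and a real dataset:

```python
import torch
from transformers import RobertaConfig, RobertaForSequenceClassification

# Tiny illustrative config; in real use, load the pre-trained checkpoint:
# model = RobertaForSequenceClassification.from_pretrained(
#     "microsoft/codebert-base", num_labels=2)
config = RobertaConfig(
    vocab_size=100, hidden_size=32, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=64, num_labels=2,
)
model = RobertaForSequenceClassification(config)

# One toy training step: the newly initialized classifier head is trained
# end-to-end together with the encoder (which, in the real case, keeps
# its pre-trained weights).
input_ids = torch.randint(0, 100, (4, 16))      # dummy batch of 4 sequences
labels = torch.tensor([0, 1, 0, 1])
outputs = model(input_ids=input_ids, labels=labels)
outputs.loss.backward()                          # gradients flow into head + encoder
```

Passing `labels` makes the model compute the cross-entropy loss itself, so the warning about newly initialized classifier weights goes away once you train like this.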

Hi @mayanksatnalika
Yes, you can load a pre-trained base model for SequenceClassification.
RobertaForSequenceClassification adds the classification head itself, so you won't need to do that manually. You can fine-tune it for classification directly.


Thank you @valhalla for the quick reply.

Just trying to understand it better: I came across an example here where a classifier is added on top of the base model and trained, rather than directly using a …ForSequenceClassification class. Would it be possible to tell me how the two approaches differ, or point me to any relevant links? :sweat_smile:

The ForSequenceClassification models do pretty much the same thing. They take a base model and add a pooler and classifier head on top of it, so you won't have to do it manually. You can see how it's done here; it's pretty easy to follow.
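To make the equivalence concrete, here is a hedged sketch of a manual head on top of the bare RobertaModel, roughly mirroring the `RobertaClassificationHead` in transformers' `modeling_roberta` (the tiny config values are illustrative, not CodeBERT's):

```python
import torch
import torch.nn as nn
from transformers import RobertaConfig, RobertaModel

# Small illustrative encoder; real use: RobertaModel.from_pretrained("microsoft/codebert-base")
config = RobertaConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                       num_attention_heads=2, intermediate_size=64)
base = RobertaModel(config, add_pooling_layer=False)  # the bare encoder

class ClassificationHead(nn.Module):
    """Roughly the head RobertaForSequenceClassification adds on top."""
    def __init__(self, hidden_size, num_labels, dropout=0.1):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.dropout = nn.Dropout(dropout)
        self.out_proj = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states):
        x = hidden_states[:, 0, :]          # take the <s> (CLS) token
        x = self.dropout(x)
        x = torch.tanh(self.dense(x))
        x = self.dropout(x)
        return self.out_proj(x)

head = ClassificationHead(config.hidden_size, num_labels=2)
input_ids = torch.randint(0, 100, (4, 16))
hidden = base(input_ids=input_ids).last_hidden_state  # (batch, seq, hidden)
logits = head(hidden)
print(logits.shape)  # torch.Size([4, 2])
```

So writing your own head and using the ready-made `RobertaForSequenceClassification` end up training essentially the same architecture; the built-in class just saves you the boilerplate.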


Thanks a lot :slight_smile: