Does changing the classification head of models always require fine-tuning?

I’m using twitter-xlm-roberta-base-sentiment for sentiment analysis, but instead of predicting 3 discrete labels (negative, neutral, positive), I need to predict sentiment as a single continuous value (e.g., ranging from -1 for negative to 1 for positive). Hence, I’m trying to change the classification head of the model.

I’ve been experimenting with adding custom classification heads as discussed in these previous posts (here and here).

As a sanity check, I replicated the XLMRobertaClassificationHead and attached it to the model. However, since the replacement head is randomly initialized and the weights and biases of the original classification head are discarded, the predictions are of course meaningless.
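For context, a minimal sketch of what such a replacement head might look like — mirroring the dense → tanh → out_proj structure of XLMRobertaClassificationHead but with a single output unit for regression. The `hidden_size=768` default and the `RegressionHead` name are assumptions for illustration, not part of the original post:

```python
import torch
import torch.nn as nn

class RegressionHead(nn.Module):
    """Sketch of a single-output head mirroring XLMRobertaClassificationHead."""

    def __init__(self, hidden_size=768, dropout=0.1):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.dropout = nn.Dropout(dropout)
        self.out_proj = nn.Linear(hidden_size, 1)  # one continuous value

    def forward(self, features):
        # features: encoder output of shape (batch, seq_len, hidden_size)
        x = features[:, 0, :]          # take the <s> token (like [CLS])
        x = self.dropout(x)
        x = torch.tanh(self.dense(x))
        x = self.dropout(x)
        return self.out_proj(x)        # unbounded; apply tanh for [-1, 1]
```

Because `out_proj` here has a different shape (768×1 instead of 768×3), the original head's `out_proj` weights cannot simply be copied over — which is exactly the problem described above.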

This leads me to my question: When changing the classification head, do I ALWAYS need to retrain the model, or can I keep the weights and biases from the previous head and use them for my modified classification head?


In this case a simple manual transformation of the original head's outputs could suffice: something like `-1 * prob_negative + 0 * prob_neutral + 1 * prob_positive`, i.e., the expected value of the label scores under the predicted probability distribution. That way you keep the pretrained head and its weights untouched, and no retraining is needed.
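A minimal sketch of that transformation, assuming the model's logits arrive in the order (negative, neutral, positive) as they do for this checkpoint; the function name is illustrative:

```python
import numpy as np

def logits_to_score(logits):
    """Map 3-class sentiment logits (negative, neutral, positive)
    to a single continuous score in [-1, 1]."""
    # softmax with the usual max-shift for numerical stability
    exp = np.exp(logits - np.max(logits))
    probs = exp / exp.sum()
    # expected value with label scores -1, 0, +1
    return float(-1.0 * probs[0] + 1.0 * probs[2])

# strongly positive logits yield a score near +1
print(logits_to_score(np.array([-2.0, 0.0, 3.0])))
```

The score is bounded in [-1, 1] by construction, since it is a convex combination of -1, 0, and +1.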