Is it possible to train a new head?

I have two data sets, A and B. They both consist of sentences and named entity labels.

I’ve fine-tuned DistilBERT for named entity recognition on data set A.

I would like to retain the encoding layer of this model, but remove the head that performs the named entity recognition.

I would then like to attach a new head for named entity recognition, and fine-tune the resulting model on data set B.

So ultimately the head is trained only on data set B, but the encoding layer is trained on both data sets A and B.
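
To make this concrete, here is a rough sketch of what I have in mind. It assumes the Hugging Face transformers library with PyTorch, which I haven't stated above. The tiny config values and the label counts (5 for A, 3 for B) are made up so the example runs without downloading anything. In practice `model_a` would be loaded from my checkpoint fine-tuned on data set A. I'm not sure this is the right way to do it.

```python
import torch
from transformers import DistilBertConfig, DistilBertForTokenClassification

# Hypothetical tiny architecture, used only to keep the sketch fast and offline.
shared = dict(vocab_size=100, dim=32, hidden_dim=64, n_layers=2, n_heads=2)

# Stand-in for the model already fine-tuned on data set A (assumed 5 labels).
model_a = DistilBertForTokenClassification(DistilBertConfig(num_labels=5, **shared))

# New model for data set B (assumed 3 labels); its classifier head is freshly initialized.
model_b = DistilBertForTokenClassification(DistilBertConfig(num_labels=3, **shared))

# Copy only the encoder weights from A; the head from A is discarded.
model_b.distilbert.load_state_dict(model_a.distilbert.state_dict())

# Nothing is frozen, so fine-tuning on B would update both the encoder and the new head.
optimizer = torch.optim.AdamW(model_b.parameters(), lr=5e-5)

# Sanity check with random token ids: one logit vector per token, sized for B's labels.
input_ids = torch.randint(0, 100, (2, 8))
logits = model_b(input_ids=input_ids).logits
```

With this, `model_b` would then go through the usual token-classification training loop on data set B.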

Is this possible? How can this be done?