Keep NSP head after BertForPreTraining

I am pretraining with MLM + NSP (BertForPreTraining), and fine-tuning with NSP only (BertForNextSentencePrediction).

Is there an elegant way to keep the NSP head from the pretrained model?

Thanks!

Update: validated that loading the pretrained checkpoint with BertForNextSentencePrediction does indeed keep the weights of the NSP head from BertForPreTraining.
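
For reference, here is a minimal sketch of how I checked this (the checkpoint directory is just a placeholder, and the randomly initialized BertForPreTraining stands in for a real pretrained model). It works because both classes name the NSP classifier `cls.seq_relationship`, so `from_pretrained` matches and loads those weights by name:

```python
import torch
from transformers import BertConfig, BertForPreTraining, BertForNextSentencePrediction

# Placeholder path; in practice this is wherever the pretrained model was saved
checkpoint_dir = "./my-pretrained-bert"

# Stand-in for a model pretrained with MLM + NSP, saved via save_pretrained()
pretraining_model = BertForPreTraining(BertConfig())
pretraining_model.save_pretrained(checkpoint_dir)

# Reload the same checkpoint with only the NSP head
nsp_model = BertForNextSentencePrediction.from_pretrained(checkpoint_dir)

# Both models expose the NSP classifier as cls.seq_relationship,
# so the weights should be identical after loading
same = torch.equal(
    pretraining_model.cls.seq_relationship.weight,
    nsp_model.cls.seq_relationship.weight,
)
print("NSP head weights preserved:", same)
```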