Resume Training / Finetune a language model and further finetune a classifier

I would like to fine-tune a strong classifier on top of a pre-trained language model. As we know, the typical approach is to fine-tune the pre-trained model directly as a classifier. What I am wondering is this: suppose I first fine-tune the pre-trained model with a language-modeling objective on DS1 (a typical text dataset), or resume training from its last checkpoint, and then further fine-tune the resulting model as a classifier on DS2 (another typical text dataset). Would this be redundant effort compared to a pipeline that just fine-tunes the pre-trained model on DS2? I would like to hear your thoughts.

Thank you.

Hi, there are indeed papers indicating that “multi-step” fine-tuning is helpful. See this paper for one example.
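
For what it's worth, here is a rough sketch of the two-stage pipeline described in the question, assuming the Hugging Face `transformers` and `datasets` libraries and a BERT-style (masked-LM) checkpoint. The function names, the `max_length` of 128, the single epoch, and the output directories are placeholders I made up for illustration, not anything from the thread. For a causal model you would presumably swap in `AutoModelForCausalLM` and set `mlm=False`.

```python
def adapt_language_model(base_checkpoint, ds1_texts, out_dir):
    """Stage 1: continue masked-LM training on unlabeled DS1 text."""
    # Heavy imports stay inside the function so defining it loads nothing.
    from datasets import Dataset
    from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained(base_checkpoint)
    model = AutoModelForMaskedLM.from_pretrained(base_checkpoint)
    data = Dataset.from_dict({"text": list(ds1_texts)}).map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
        batched=True, remove_columns=["text"])
    # The collator pads each batch and randomly masks tokens for the MLM loss.
    collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer, mlm=True, mlm_probability=0.15)
    args = TrainingArguments(output_dir=out_dir, num_train_epochs=1,
                             report_to="none")
    trainer = Trainer(model=model, args=args, train_dataset=data,
                      data_collator=collator)
    trainer.train()
    trainer.save_model(out_dir)          # adapted weights
    tokenizer.save_pretrained(out_dir)   # so stage 2 can load from out_dir
    return out_dir


def finetune_classifier(checkpoint, ds2_texts, ds2_labels, num_labels, out_dir):
    """Stage 2: fine-tune a classifier on labeled DS2, starting from `checkpoint`."""
    from datasets import Dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              DataCollatorWithPadding, Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # The encoder weights are reused; the classification head is newly initialized.
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels)
    data = Dataset.from_dict(
        {"text": list(ds2_texts), "label": list(ds2_labels)}).map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
        batched=True, remove_columns=["text"])
    args = TrainingArguments(output_dir=out_dir, num_train_epochs=1,
                             report_to="none")
    trainer = Trainer(model=model, args=args, train_dataset=data,
                      data_collator=DataCollatorWithPadding(tokenizer=tokenizer))
    trainer.train()
    trainer.save_model(out_dir)
    return out_dir


# Example usage (downloads a model and trains, so it is left commented out):
# adapted = adapt_language_model("bert-base-uncased", ds1_texts, "lm-adapted")
# finetune_classifier(adapted, ds2_texts, ds2_labels, 2, "clf-two-stage")
# finetune_classifier("bert-base-uncased", ds2_texts, ds2_labels, 2, "clf-direct")
```

Running the last two calls and comparing both classifiers on a held-out part of DS2 is probably the most direct way to see whether the extra step pays off for your data.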