Apologies in advance if the question is silly, I’m trying to learn about huggingface and nlp in general.
My doubt is the following: suppose I want to do text generation and will work with the
gpt2 pre-trained model. First, I fine-tune it on an astronomy dataset and save the model as
gpt2-astronomy. Then I fine-tune
gpt2-astronomy on a physics dataset and save it as final-model.
My question is: will this
final-model be good at generating text about astronomy as well as physics? Or does fine-tuning a second time "eliminate" the model's ability with astronomy subjects?
I ask this question because, as I understand it, fine-tuning mainly adjusts the last layer of the network, so I don't know whether fine-tuning a second time resets that last layer, which learned about astronomy the first time.
Apologies if the answer is silly, I’ve been using BERT and not GPT2.
I think your twice-trained model would probably remember at least some of the astronomy training, as well as the physics training.
If you had a really huge corpus of physics texts, it might overwrite your astronomy training, but I think that is unlikely. Some researchers have shown that many transformer models have far more capacity than they need. Also, there is probably considerable overlap between physics and astronomy vocabulary.
When you fine-tune, you can choose which layers of the model are updated and which are frozen. You could also consider gradual unfreezing, where you start by training only the top layers and progressively unfreeze deeper ones.
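As a rough sketch of what freezing by layer looks like: real GPT-2 parameters have names like `transformer.h.11.attn.c_attn.weight`, and you can decide trainability by matching name prefixes. The helper and parameter names below are hypothetical stand-ins, not an official API:

```python
def trainable_mask(param_names, unfreeze_prefixes):
    """Map each parameter name to True if it should stay trainable,
    i.e. its name starts with one of the given prefixes."""
    return {
        name: any(name.startswith(p) for p in unfreeze_prefixes)
        for name in param_names
    }

# Hypothetical parameter names modeled on GPT-2's naming scheme.
names = [
    "transformer.h.0.attn.c_attn.weight",   # first transformer block
    "transformer.h.11.mlp.c_fc.weight",     # last transformer block
    "lm_head.weight",                       # output head
]

# Unfreeze only the last block and the output head; freeze the rest.
mask = trainable_mask(names, ["transformer.h.11", "lm_head"])
```

With a real Hugging Face model you would apply something like this via `for name, p in model.named_parameters(): p.requires_grad = ...`; for gradual unfreezing, you would extend the prefix list every few epochs so deeper blocks become trainable over time.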
It all depends on how much data you have and how long you train. If you have billions of texts and train for billions of steps, the astronomy part will hardly be remembered. What you can do, however, is merge the two datasets and shuffle them, so that the model sees both evenly throughout training.
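The merge-and-shuffle idea can be sketched in a few lines (the documents below are placeholders standing in for the two fine-tuning corpora):

```python
import random

# Placeholder documents standing in for the two corpora.
astronomy = [f"astronomy doc {i}" for i in range(4)]
physics = [f"physics doc {i}" for i in range(4)]

# Merge and shuffle so each training batch mixes both domains,
# instead of the model seeing all astronomy first, then all physics.
combined = astronomy + physics
random.seed(0)
random.shuffle(combined)
```

If you use Hugging Face's datasets library, I believe the equivalent is `concatenate_datasets([ds_astro, ds_phys]).shuffle(seed=0)`.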
PS: it is not necessarily the case that only the last layer is fine-tuned. That is something you, as the user, can decide.