Can you fine tune fine-tuned models?

onturenio · March 13, 2023, 8:25am

Hi, I’m new to the community. I’m reading the course and I’m a bit confused about the concepts of fine-tuning, model heads, and so on in transformers. I’m more familiar with convnets, so I’ll try to draw an analogy to highlight what I don’t quite grasp.

In convnets, to fine-tune a model for image classification, for instance, you remove the last layer of the model, add a dense layer with a softmax at the end, freeze the rest of the model, and train it again with a smaller learning rate. What happens when you do the same to a transformer? What exactly is the “head”? Do you also freeze the rest of the model? I didn’t see that step in the course.

I’ll try to be more concrete. I’m thinking of the model Recognai/selectra_medium, which is an Electra model fine-tuned for Spanish. Then there is Recognai/zeroshot_selectra_medium, which is a fine-tuned version of the former for zero-shot classification. Don’t you remove the head for fine-tuning? Doesn’t that take you back to the regular, non-fine-tuned model? Can I fine-tune the zero-shot classification model? Which head do I remove then, and how does this not undo the earlier fine-tuning?
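
To make the “head” idea concrete, here is a toy PyTorch sketch, not code from the course. `TinyClassifier`, its layer sizes, and the 3-class and 5-class tasks are all hypothetical stand-ins: `body` plays the role of the transformer layers and `head` the small task-specific layer on top. It assumes the earlier fine-tuning updated the body as well as the head, as happens when no parameters are frozen. Under that assumption, swapping the head discards only the old head, and the body keeps whatever the first fine-tuning changed. Freezing is shown as an optional, convnet-style step.

```python
import copy

import torch
from torch import nn

torch.manual_seed(0)


class TinyClassifier(nn.Module):
    """Hypothetical stand-in for a transformer: a shared body plus a task head."""

    def __init__(self, hidden=16, num_labels=3):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(8, hidden), nn.Tanh())
        self.head = nn.Linear(hidden, num_labels)

    def forward(self, x):
        return self.head(self.body(x))


# Pretend this model has already been fine-tuned on a 3-class task.
model = TinyClassifier(num_labels=3)
body_before = copy.deepcopy(model.body.state_dict())

# Fine-tune again for a new 5-class task: replace only the head.
model.head = nn.Linear(16, 5)

# Optional: freeze the body so that only the new head is trained.
for p in model.body.parameters():
    p.requires_grad = False

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)

# One training step on random toy data.
x = torch.randn(4, 8)
y = torch.tensor([0, 1, 2, 4])
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()

# The body weights are exactly what the earlier training left behind.
body_unchanged = all(
    torch.equal(body_before[k], v) for k, v in model.body.state_dict().items()
)
print(model(x).shape, body_unchanged)
```

In the transformers library, loading a checkpoint into a class such as `AutoModelForSequenceClassification` with a different `num_labels` plays a similar role; when the saved head has a different shape, `from_pretrained` accepts `ignore_mismatched_sizes=True` so that the head is reinitialized. If the new task uses the same labels as the old one, the existing head can simply be kept and trained further.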

Thanks a lot in advance!

dmnapolitano · March 31, 2023, 8:39pm

Hi @onturenio were you able to find any answers to your questions? I have the same ones!

Thanks,
Diane

onturenio · April 12, 2023, 6:38am

Unfortunately not yet. I’m still learning, but I cannot fully answer these questions. I’ll come back to this post if I ever end up understanding the issue.

adityashukzy · April 12, 2023, 8:10am

Hey!

I have a similar question: training an LLM for more than 5 epochs at a time is increasingly difficult on Kaggle or Colab.

One strategy I came up with is to train for 3 epochs at a time and repeat this 5 times in sequence. Each session starts from the model produced by the previous one. I would say the underlying concept of “fine-tuning fine-tuned models” is the same as what you’re asking about?
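
A minimal sketch of that chunked strategy in plain PyTorch, under toy assumptions: a hypothetical linear model, full-batch data, a constant learning rate, and an in-memory buffer standing in for a checkpoint file. It compares 6 epochs in one session against 3 epochs, a checkpoint, and 3 more epochs. The helper names `make_model_and_optimizer` and `train` are made up for illustration.

```python
import io

import torch
from torch import nn


def make_model_and_optimizer():
    torch.manual_seed(0)  # same initial weights on every call
    model = nn.Linear(4, 1)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
    return model, optimizer


def train(model, optimizer, x, y, epochs):
    # Full-batch training, so one optimizer step per "epoch".
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()


torch.manual_seed(1)
x, y = torch.randn(32, 4), torch.randn(32, 1)

# Reference run: 6 epochs in a single session.
ref_model, ref_opt = make_model_and_optimizer()
train(ref_model, ref_opt, x, y, epochs=6)

# Chunked run, session 1: 3 epochs, then save model AND optimizer state.
model, opt = make_model_and_optimizer()
train(model, opt, x, y, epochs=3)
buffer = io.BytesIO()  # stands in for a checkpoint file on disk
torch.save({"model": model.state_dict(), "optimizer": opt.state_dict()}, buffer)

# Chunked run, session 2: rebuild, restore both states, train 3 more epochs.
buffer.seek(0)
checkpoint = torch.load(buffer)
resumed_model, resumed_opt = make_model_and_optimizer()
resumed_model.load_state_dict(checkpoint["model"])
resumed_opt.load_state_dict(checkpoint["optimizer"])
train(resumed_model, resumed_opt, x, y, epochs=3)

same = torch.allclose(ref_model.weight, resumed_model.weight)
print(same)
```

The sketch restores the optimizer state along with the weights. If only the weights are reloaded, or a learning-rate schedule restarts from the beginning in each session, the chunked run is no longer the same as one long run, although it is still a valid way to continue training. With the transformers `Trainer`, `trainer.train(resume_from_checkpoint=True)` is meant for this kind of resumption.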

Any thoughts?

mcr1974 · September 12, 2023, 1:31pm

Any news on this? All the tutorials start from e.g. base llama2 for fine-tuning. Can I start from a model that’s already been fine-tuned on top of base llama2?
