Hi,
When we fine-tune a pretrained model using the Trainer API that HF provides, what actually happens behind the scenes?
I am struggling to find any resources out there that describe how the Trainer API carries out fine-tuning — whether it freezes some layers, adds a new layer, trains only the last few layers, etc.
Here is my current understanding of this process.
Typically, when you fine-tune with HF, you load a tokenizer and a downstream task model. The model is the pretrained base with a downstream task head added on top. When you "train" with the Trainer, it keeps the tokenizer fixed (from my understanding) and you are just updating the weights of the downstream task model.
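To make the question concrete, here is a small sketch of the freezing behaviour I'm asking about. This is plain PyTorch with a toy model (the class and layer names here are made up, not actual HF internals): by default every parameter has `requires_grad=True`, so nothing is frozen unless you freeze it yourself — which is what I'd like to confirm the Trainer does (or doesn't do) behind the scenes.

```python
import torch
from torch import nn

# Toy stand-in for a pretrained backbone plus a task head
# (hypothetical names; real HF model classes differ).
class ToyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 8)  # pretend "pretrained" layers
        self.head = nn.Linear(8, 2)      # newly added task head

    def forward(self, x):
        return self.head(self.backbone(x))

model = ToyClassifier()

# By default, every parameter is trainable -- nothing is frozen.
print(all(p.requires_grad for p in model.parameters()))  # True

# Freezing the backbone is an explicit, opt-in step:
for p in model.backbone.parameters():
    p.requires_grad = False

# Now only the head's parameters would receive gradient updates.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

If the Trainer doesn't do anything like the freezing loop above internally, then I'd expect full fine-tuning (all weights updated) to be the default — is that right?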
Is this correct? Can someone recommend some reading resources?