Do you train all layers when fine-tuning T5?

I haven’t seen many experiments on this, but IMO it’s better to fine-tune the whole model.

Also, when you pass the labels argument to T5ForConditionalGeneration's forward method, it computes the loss for you and returns it as the first value in the output (outputs.loss, or the first element of the returned tuple in older versions).

You can also use the finetune.py script here to fine-tune T5 and other seq2seq models.

See this thread: T5 Finetuning Tips