How does finetuning a transformer (T5) work?

Welcome! I’ll take a shot at answering this, but I’m not an expert at this, so I may be wrong!

As far as I understand, when you instantiate a model the weights are not frozen, so if you start finetuning, all parameters will be trainable. If you want to freeze weights, you have to do that yourself, and how you do it depends on which library (PyTorch, TensorFlow, or Flax) you’re using.
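For example, right after loading you can check that nothing is frozen yet (a minimal sketch, assuming a PyTorch install and the t5-small checkpoint):

from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
# Nothing is frozen by default, so this prints True
print(all(p.requires_grad for p in model.parameters()))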

When you use AutoModelForSeq2SeqLM (or any of the other AutoModel* classes) to instantiate a model with .from_pretrained, the backend that gets used is PyTorch (see the Auto Classes docs). So once you’ve loaded the model with the PyTorch backend, if you want to freeze all of the base model’s weights you can access them and freeze them with:

# Freeze the base model's parameters so they are not updated during training
for param in model.base_model.parameters():
    param.requires_grad = False
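You can sanity-check the freeze by counting how many parameters still require gradients; only those will be updated during finetuning (a quick sketch):

# Parameters with requires_grad=True are the only ones the optimizer will update
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable} / {total}")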

If you were to use one of the TensorFlow or Flax auto-models instead, you’d have to follow those libraries’ own methods for freezing layers, if that’s what you wanted to do.
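For instance, with the TensorFlow classes a rough sketch might look like this (assuming TensorFlow is installed and using the t5-small checkpoint; Keras layers use a trainable flag instead of requires_grad):

from transformers import TFAutoModelForSeq2SeqLM

tf_model = TFAutoModelForSeq2SeqLM.from_pretrained("t5-small")
# Setting trainable=False on a Keras layer excludes its weights from training
for layer in tf_model.layers:
    layer.trainable = False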

I hope this helps!
