Hey, I wanted to fine-tune an mt5-base model for my project (machine translation), but when I try to freeze all the parameters except the language-modeling head I get an error. Can anyone help me understand why this happens?
(I also read in the docs that fine-tuning transformers usually gives better results when every parameter is updated rather than frozen, but I don't have enough compute for that. Is that actually the case?)
This is how I freeze the parameters:
from transformers import T5ForConditionalGeneration, T5TokenizerFast, Trainer, TrainingArguments

model = T5ForConditionalGeneration.from_pretrained("t5-small")
tokenizer = T5TokenizerFast.from_pretrained("t5-small")

# freeze every parameter in the model
for param in model.parameters():
    param.requires_grad = False
# then try to unfreeze only the LM head
model.lm_head.requires_grad = True
training_args = TrainingArguments(
    output_dir="mt5-finetuned",
    num_train_epochs=3,
    per_device_train_batch_size=32,  ## lower batch sizes
    per_device_eval_batch_size=32,   ## lower batch sizes
    evaluation_strategy="epoch",
    learning_rate=5e-4,
    weight_decay=0.01,
    save_total_limit=3,
    # fp16=True,  ## lower precision
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset,  # ["train"]
    eval_dataset=dataset,   # ["validation"]
)

trainer.train()
But I get this error:
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
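Could the problem be that requires_grad has to be set on the parameters of lm_head rather than on the module object itself? That's just a guess on my part and I haven't verified it, but I mean something like this (plain PyTorch, untested):

# untested guess: flip requires_grad on the head's parameters, not on the module attribute
for param in model.lm_head.parameters():
    param.requires_grad = True

# sanity check: count how many parameters would actually be trained
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")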
Thanks everyone in advance.