Max_length parameter in T5

I am trying to finetune a set of T5 models and it is going well.

However, I have noticed a “max_length” parameter showing up in the config parameters in W&B. I am using this example script for summarization.

I thought that during training the model keeps predicting tokens autoregressively until the EOS token is generated. If the model only predicts a maximum of 20 tokens during training, that might explain why my validation loss is lower than my training loss (I am using generation_max_length = 80 for the validation metrics).

I have not found any information about this parameter in the documentation, or any indication of where it comes from.

If anyone is familiar with the parameter, I would appreciate an explanation of it.

Thanks in advance

After some further research, it seems the parameter comes from PretrainedConfig in configuration_utils.py.

I am still not sure if this parameter is used during training or what effect it has.


Hi navjordj, I also use T5 and W&B, and changed max_length to another value. For me, max_length=20 also keeps appearing and I don't know why. Did you find out anything else so far?

Hello tsei902! :hugs:

I discovered that the parameter originates from configuration_utils.py, which is inherited by the T5 configuration.

Upon examining the source code, I couldn't find any instances where this parameter is actually used. My best guess is that if generation_max_length is not provided, generation falls back to max_length.
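To make the guess concrete, here is a minimal plain-Python sketch of the fallback behaviour being described. The function name and structure are illustrative, not the actual transformers internals; the only fact taken from the thread is that PretrainedConfig defaults max_length to 20:

```python
# Illustrative sketch (not transformers source): generation_max_length,
# when set, takes priority; otherwise the config's max_length is used,
# which PretrainedConfig defaults to 20.

CONFIG_DEFAULT_MAX_LENGTH = 20  # default set in PretrainedConfig


def effective_max_length(generation_max_length=None,
                         config_max_length=CONFIG_DEFAULT_MAX_LENGTH):
    """Return the max_length that generation would actually use."""
    if generation_max_length is not None:
        return generation_max_length
    return config_max_length


print(effective_max_length(80))  # explicit value wins -> 80
print(effective_max_length())    # falls back to config default -> 20
```

If this is right, it would explain why max_length=20 still shows up in the logged W&B config even when a different value is passed at generation time.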

Please let me know if you find out whether the parameter is used or has any effect.

Yes, I found that now too. It seems the parameter is a T5 default and doesn't get overwritten in the W&B config, even though I pass another max_length value during generation. I.e., the default of max_length=20 is only used if no other value is given.