I am trying to finetune a set of T5 models and it is going well.
However, I have noticed a "max_length" parameter showing up in the config parameters in W&B. I am using this example script for summarization.
I thought that during training the model keeps predicting tokens autoregressively until the eos token gets generated. If the model only predicts a maximum of 20 tokens during training it might explain why my validation loss is lower than my train loss (I am using generation_max_length = 80 for validation metrics).
I have not found any info about this parameter in the documentation or where it comes from.
If anyone is familiar with the parameter, I would appreciate an explanation of it.
Thanks in advance
After some further research, it seems like the parameter comes from PretrainedConfig in configuration_utils.py.
I am still not sure if this parameter is used during training or what effect it has.
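For anyone following along: a minimal stdlib sketch (not the actual transformers source) of the `kwargs.pop(...)` defaulting pattern that PretrainedConfig uses, which would explain why `max_length=20` appears even when you never set it. The class name here is made up for illustration; only the default value of 20 matches what we are seeing in W&B:

```python
# Minimal sketch (NOT the real transformers code) of how a config class
# can pick up a default max_length when the caller never provides one,
# mirroring the kwargs.pop(...) pattern in configuration_utils.py.
class SketchConfig:
    def __init__(self, **kwargs):
        # Falls back to 20 when max_length is not passed in.
        self.max_length = kwargs.pop("max_length", 20)


default_cfg = SketchConfig()              # no max_length given
custom_cfg = SketchConfig(max_length=80)  # explicit override

print(default_cfg.max_length)  # 20
print(custom_cfg.max_length)   # 80
```

So a config built without an explicit `max_length` would silently carry the default 20, and that is what gets logged to W&B.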
Hi navjordj, I also use T5 & wandb, and I changed max_length to another value. Still, max_length=20 keeps appearing for me and I don't know why. Did you find out anything else so far?
Hello tsei902! 
I discovered that the parameter originates from configuration_utils.py, which is inherited by the T5 configuration.
Upon examining the source code, I couldn't find any instances where this parameter is utilized. My best guess is that if generation_max_length is not provided, it defaults to max_length.
Please let me know if you find out whether the parameter is used or has an effect.
Yes, I found that now too. It seems that parameter is the T5 default and doesn't get overwritten in wandb, even though I pass another max_length value during generation. I.e., the default of max_length=20 is only used if no other value is given.
Hi @tsei902, where do I have to override this parameter?
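Not speaking for the posters above, but one place you can see and override it is directly on the model config. A minimal sketch, assuming a recent transformers version where T5Config inherits the `max_length` attribute from PretrainedConfig (exact behavior may vary by version):

```python
from transformers import T5Config

# A fresh T5Config inherits max_length from PretrainedConfig.
cfg = T5Config()
print(cfg.max_length)  # 20 (the inherited default)

# Override it before (or after) loading the model with this config:
cfg.max_length = 80
print(cfg.max_length)  # 80

# Alternatively, pass max_length directly to generate(), e.g.
#   model.generate(input_ids, max_length=80)
# which takes precedence over the config default for that call.
```

With the Seq2SeqTrainer, setting `generation_max_length` in the training arguments should likewise take precedence during evaluation, while the bare config default of 20 is only a fallback.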