I am fine-tuning a google/flan-t5-base model, but during training I get this warning:

"UserWarning: Using the model-agnostic default max_length (=20) to control the generation length. We recommend setting max_new_tokens to control the maximum length of the generation"

The training code is as follows:
from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer

L_RATE = 3e-4
BATCH_SIZE = 32
PER_DEVICE_EVAL_BATCH = 32
WEIGHT_DECAY = 0.01
SAVE_TOTAL_LIM = 3
NUM_EPOCHS = 2

training_args = Seq2SeqTrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=L_RATE,
    per_device_train_batch_size=BATCH_SIZE,
    per_device_eval_batch_size=PER_DEVICE_EVAL_BATCH,
    weight_decay=WEIGHT_DECAY,
    save_total_limit=SAVE_TOTAL_LIM,
    num_train_epochs=NUM_EPOCHS,
    predict_with_generate=True,
    push_to_hub=False,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)
I cannot find where I should specify max_new_tokens. Passing it directly to training_args does not work; Seq2SeqTrainingArguments rejects it with:

"Seq2SeqTrainingArguments.__init__() got an unexpected keyword argument 'max_new_tokens'"
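For reference, this is a minimal sketch of the failing attempt (the value 128 is just a placeholder I added for illustration; any value fails the same way):

training_args = Seq2SeqTrainingArguments(
    output_dir="./results",
    predict_with_generate=True,
    max_new_tokens=128,  # rejected: unexpected keyword argument 'max_new_tokens'
)

Where should I set max_new_tokens so that generation during evaluation respects it?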