I am fine-tuning a google/flan-t5-base model, but during training I get this warning:

"UserWarning: Using the model-agnostic default max_length (=20) to control the generation length. We recommend setting max_new_tokens to control the maximum length of the generation"

The training code is as follows:
from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer

L_RATE = 3e-4
BATCH_SIZE = 32
PER_DEVICE_EVAL_BATCH = 32
WEIGHT_DECAY = 0.01
SAVE_TOTAL_LIM = 3
NUM_EPOCHS = 2

training_args = Seq2SeqTrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=L_RATE,
    per_device_train_batch_size=BATCH_SIZE,
    per_device_eval_batch_size=PER_DEVICE_EVAL_BATCH,
    weight_decay=WEIGHT_DECAY,
    save_total_limit=SAVE_TOTAL_LIM,
    num_train_epochs=NUM_EPOCHS,
    predict_with_generate=True,
    push_to_hub=False,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)
I cannot find where I should specify max_new_tokens. Passing it directly to training_args does not work; Seq2SeqTrainingArguments rejects it with:

"Seq2SeqTrainingArguments.__init__() got an unexpected keyword argument 'max_new_tokens'"
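For reference, this is a minimal sketch of the failing attempt (the value 128 is just a placeholder I added for illustration; any value fails the same way):

training_args = Seq2SeqTrainingArguments(
    output_dir="./results",
    predict_with_generate=True,
    max_new_tokens=128,  # rejected: unexpected keyword argument 'max_new_tokens'
)

Where should I set max_new_tokens so that generation during evaluation respects it?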