Confused about max_length and max_new_tokens

I’m trying to run the example code from the flan-t5-base model card:

from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base")

# Tokenize the prompt into input IDs
input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Generate and decode the output
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))

I’m getting the following warning:

UserWarning: Neither max_length nor max_new_tokens has been set, max_length will default to 20 (generation_config.max_length). Controlling max_length via the config is deprecated and max_length will be removed from the config in v5 of Transformers -- we recommend using max_new_tokens to control the maximum length of the generation.

How should I configure this? Is it similar to the OpenAI Playground, where the default is 256 tokens but the model actually supports up to 4,000?

Hey, was this resolved?

No, I’m waiting for a reply.

Using outputs = model.generate(input_ids, max_length=60) worked for me without triggering the warning.
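
For reference, the difference between the two parameters: max_length caps the total length of the output sequence (and for decoder-only models the prompt counts toward that total), while max_new_tokens counts only the freshly generated tokens, which is why the warning recommends it. A minimal sketch contrasting the two, reusing the flan-t5-base setup from the question (the value 60 is arbitrary):

from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base")
input_ids = tokenizer("translate English to German: How old are you?", return_tensors="pt").input_ids

# Caps the whole output sequence at 60 tokens
outputs_a = model.generate(input_ids, max_length=60)

# Caps only the number of newly generated tokens at 60
outputs_b = model.generate(input_ids, max_new_tokens=60)

print(tokenizer.decode(outputs_a[0], skip_special_tokens=True))
print(tokenizer.decode(outputs_b[0], skip_special_tokens=True))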

Just give generate an explicit budget of new tokens:

outputs = model.generate(input_ids, max_new_tokens=4000)
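
If you’d rather not pass it on every call, you can also set a default on the model’s generation config so the warning doesn’t come back. A minimal sketch, assuming the same model, tokenizer, and input_ids objects from the question and a recent transformers version:

# Set a default cap once; later generate() calls inherit it
model.generation_config.max_new_tokens = 256

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))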