"What’s the Difference Between max_length and max_new_tokens?"

You’re encountering a warning in your code:
UserWarning: Neither max_length nor max_new_tokens has been set, max_length will default to 20 (generation_config.max_length). Controlling max_length via the config is deprecated and max_length will be removed from the config in v5 of Transformers – we recommend using max_new_tokens to control the maximum length of the generation.
The key difference is what each parameter counts: max_length limits the total sequence length (prompt tokens plus generated tokens), while max_new_tokens limits only the tokens generated after the prompt. Because max_new_tokens is independent of prompt length, it is the recommended way to control output length. To resolve the warning, pass max_new_tokens explicitly to the model.generate() method.

Here’s how you can modify your code:
outputs = model.generate(input_ids, max_new_tokens=100) # Set to desired value
Adjust max_new_tokens to whatever limit suits your use case. This mirrors how the OpenAI playground defaults to generating 256 tokens per response even though the underlying model supports a much larger context window (roughly 4,000 tokens for the original GPT-3 models).
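To see why max_new_tokens is the safer choice, here is a minimal sketch of how each parameter translates into a generation budget. The generation_budget helper is hypothetical (not part of the transformers API); it only illustrates the arithmetic the library performs internally.

```python
def generation_budget(prompt_len, max_length=None, max_new_tokens=None):
    """Return how many new tokens generation may produce (illustrative only)."""
    if max_new_tokens is not None:
        # max_new_tokens is a fixed budget, independent of the prompt length
        return max_new_tokens
    if max_length is not None:
        # max_length caps prompt + output, so the budget shrinks as the prompt grows
        return max(0, max_length - prompt_len)
    # No setting at all falls back to the deprecated default of max_length = 20
    return max(0, 20 - prompt_len)

# With a 15-token prompt, max_length=20 leaves room for only 5 new tokens,
# while max_new_tokens=100 always allows 100 regardless of prompt length.
print(generation_budget(15, max_length=20))       # 5
print(generation_budget(15, max_new_tokens=100))  # 100
```

Note that with max_length alone, a prompt of 20 or more tokens leaves a budget of zero, which is exactly the kind of silent truncation the warning is steering you away from.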

Summary of Changes:

  • Replace max_length with max_new_tokens in your model.generate() call.
  • Adjust max_new_tokens to your desired output length based on your use case.

This should resolve the warning and give you more control over the output length.