Setting max_length does not limit length of output

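In the current Transformers library code, max_new_tokens takes precedence over max_length when both are set. Also note that max_length counts the prompt tokens as well as the generated ones, whereas max_new_tokens counts only the generated tokens, so specifying max_new_tokens is the simplest way to cap the length of the output.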
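To make the difference concrete, here is a rough pure-Python sketch of how the two settings turn into a token budget for a decoder-only model. The helper `effective_total_length` is a hypothetical name made up for illustration, not a function from Transformers, and the real library logic has more cases than this.

```python
from typing import Optional


def effective_total_length(
    prompt_len: int,
    max_length: Optional[int] = None,
    max_new_tokens: Optional[int] = None,
) -> Optional[int]:
    """Hypothetical helper: total token budget (prompt + generated tokens).

    A simplified model of the precedence described above, not library code.
    """
    if max_new_tokens is not None:
        # max_new_tokens wins when both are given; it counts only generated
        # tokens, so the prompt length is added on top.
        return prompt_len + max_new_tokens
    # max_length already includes the prompt; None means no limit was passed.
    return max_length


prompt_len = 30
total = effective_total_length(prompt_len, max_length=20, max_new_tokens=50)
new_tokens_allowed = total - prompt_len
print(total, new_tokens_allowed)
```

In an actual call this would correspond to passing max_new_tokens directly, e.g. `model.generate(**inputs, max_new_tokens=50)`, rather than trying to work out a max_length that accounts for the prompt.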
