I am trying to use pipeline and want to set the maximum length for both the tokenizer and the generation process. However, if I try:
from transformers import pipeline  # tokenizer and model are already loaded earlier

prompt = 'What is the answer of 1 + 1?'
pipe = pipeline(
    "text-generation",
    tokenizer=tokenizer,
    model=model,
    do_sample=True,
    truncation=True,
    padding='max_length',
    num_return_sequences=2,
    temperature=1.0,
    num_beams=1,
    max_length=1024,
)
messages = [
{"role": "user", "content": prompt},
]
ret = pipe(messages)
I get the following error message:
ValueError: Input length of input_ids is 1024, but `max_length` is set to 1024. This can lead to unexpected behavior. You should consider increasing `max_length` or, better yet, setting `max_new_tokens`.
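As far as I understand, the input is already 1024 tokens long because padding='max_length' pads the prompt up to max_length before generation even starts. A minimal check of this assumption (reusing the same tokenizer I pass to the pipeline, and assuming it has a pad token set):

# Hypothesis: padding='max_length' pads the short prompt to 1024 tokens,
# so there is no room left for newly generated tokens within max_length.
enc = tokenizer(prompt, padding='max_length', truncation=True, max_length=1024)
print(len(enc['input_ids']))  # 1024, matching the length in the error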
Therefore, I set max_new_tokens according to the error message, as follows:
prompt = 'What is the answer of 1 + 1?'
pipe = pipeline(
    "text-generation",
    tokenizer=tokenizer,
    model=model,
    do_sample=True,
    truncation=True,
    padding='max_length',
    num_return_sequences=2,
    temperature=1.0,
    num_beams=1,
    max_length=1024,
    max_new_tokens=512,
)
messages = [
{"role": "user", "content": prompt},
]
ret = pipe(messages)
However, I then get this warning every time:
Both `max_new_tokens` (=512) and `max_length`(=1024) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)
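For comparison, bypassing pipeline and calling the tokenizer and model.generate directly produces no warning, because the tokenizer's max_length and the generation budget are set independently. This is only a sketch of what I mean (it assumes model and tokenizer are a causal-LM pair, e.g. loaded via AutoModelForCausalLM / AutoTokenizer, and it skips the chat template that pipeline applies to messages):

# Truncate the prompt to at most 1024 tokens, then allow up to 512 new tokens.
inputs = tokenizer(prompt, truncation=True, max_length=1024, return_tensors='pt')
out = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.0,
    num_return_sequences=2,
    max_new_tokens=512,
)
print(tokenizer.batch_decode(out, skip_special_tokens=True))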
I wonder what the correct way is to set the max_length parameter for both the tokenizer and model.generate(…). Should I use alternative arguments in pipeline?