Dynamically change max_new_tokens in a pipeline

victorlacerdab · July 5, 2024, 8:59pm

Is there any way to change the max_new_tokens used in a pipeline without having to load the whole pipeline again?

I am using gemma27b-it and just playing around with the model: I start with a base prompt, then chain the model’s answers into new prompts, and so on. Right now max_new_tokens is static, and I want to control it on every call to the pipeline object. For a question like ‘Paris is the capital of France: T or F?’ I want to set max_new_tokens = 1, and for more elaborate questions I want to raise the limit. Is this possible?

GPT007 · July 7, 2024, 8:03am

Specify max_new_tokens when you call the pipeline.

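Ex (pipe is the pipeline object you already loaded):
Q = "Paris is the capital of France: T or F?"
# The call returns a list of dicts; return_full_text=False keeps only the new text, not the prompt.
T_or_F = pipe(Q, max_new_tokens=1, return_full_text=False)[0]["generated_text"]
Elaborated = pipe(f"Elaborate the response to '{Q}': {T_or_F}.", max_new_tokens=1024)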
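If you end up doing this a lot, a small wrapper may keep the per-call limits in one place. The following is only a sketch: `TOKEN_BUDGETS`, `ask`, and `chain` are hypothetical names, and the budget values are illustrative. It assumes `generator` behaves like a text-generation pipeline, that is, it accepts `max_new_tokens` and `return_full_text` as call-time arguments and returns a list of dicts with a `generated_text` key.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical per-question-type limits; tune these for your own prompts.
TOKEN_BUDGETS: Dict[str, int] = {"true_false": 1, "short": 64, "elaborate": 1024}

# Anything callable like a text-generation pipeline: generator(prompt, **kwargs)
Generator = Callable[..., List[Dict[str, Any]]]


def ask(generator: Generator, prompt: str, kind: str = "short") -> str:
    """Call an already-loaded pipeline with a token limit chosen per call."""
    outputs = generator(
        prompt,
        max_new_tokens=TOKEN_BUDGETS[kind],  # passed at call time, no reload needed
        return_full_text=False,              # keep only the newly generated text
    )
    return outputs[0]["generated_text"]


def chain(generator: Generator, question: str) -> Tuple[str, str]:
    """Get a one-token verdict, then feed it into a longer follow-up prompt."""
    verdict = ask(generator, question, "true_false")
    follow_up = f"Elaborate the response to '{question}': {verdict}."
    return verdict, ask(generator, follow_up, "elaborate")


# Usage with a real pipeline (not executed here, since it downloads a model):
#   from transformers import pipeline
#   pipe = pipeline("text-generation", model="<your model id>")
#   verdict, details = chain(pipe, "Paris is the capital of France: T or F?")
```

Keep in mind that max_new_tokens counts tokens, not words or characters, so a limit of 1 can come back as whitespace or a word fragment rather than a clean "T" or "F", depending on the model and prompt format.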
