How to set generation parameters for transformers.pipeline?

I can’t figure out the correct way to set the config / generation config parameters for transformers.pipeline (temperature, max_new_tokens, torch_dtype, device_map, etc.).

from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model=hf_model_id,
    temperature=0.1,
    max_new_tokens=30,
    torch_dtype="auto",
    device_map="auto",
)
pipe(prompt)

If I just pass the arguments to pipeline like this, I get: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)

But if I try this instead:

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig, pipeline

hf_model_id = "eachadea/vicuna-7b-1.1"
model = AutoModelForCausalLM.from_pretrained(hf_model_id)
tokenizer = AutoTokenizer.from_pretrained(hf_model_id, legacy=False)
generation_config, unused_kwargs = GenerationConfig.from_pretrained(
    hf_model_id, max_new_tokens=200, temperature=0.1, return_unused_kwargs=True
)
model.generation_config = generation_config
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)
pipe(prompt)

This usually doesn’t throw the warning, but it uses the default config instead of whatever I put in model.generation_config (for example, instead of 200 max_new_tokens, it only generates the default 20 tokens).


cc @joaogante

I have the same question


Hi vivio,

Try passing the generation parameters directly to the pipeline call as kwargs:
pipe(prompt, max_new_tokens=200, temperature=0.1)

I believe it should work; look at the example for TextGenerationPipeline at
“Pipelines”
where “do_sample=True” is being passed to the pipeline instance.
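For example, something along these lines (a minimal sketch; the model id is taken from your question and the prompt is just a placeholder):

from transformers import pipeline

hf_model_id = "eachadea/vicuna-7b-1.1"   # model id from your question
prompt = "Tell me a short story."        # placeholder prompt, just for illustration

# Keep only loading-related arguments (torch_dtype, device_map) in pipeline();
# this avoids modifying the pretrained model config, so no warning is raised.
pipe = pipeline(
    "text-generation",
    model=hf_model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Generation parameters passed at call time are forwarded to generate()
# and take precedence over the model's default generation config.
output = pipe(prompt, max_new_tokens=200, temperature=0.1, do_sample=True)
print(output[0]["generated_text"])

Note that temperature only has an effect when do_sample=True, which is why it is set here.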

GenerationConfig.from_pretrained works with a model instance created with AutoModelForCausalLM. Then you can call:
model.generate(**inputs, generation_config=generation_config)
with inputs being the prompt, tokenized first.
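Something like this, for example (a sketch assuming the same model id; the prompt is again a placeholder):

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

hf_model_id = "eachadea/vicuna-7b-1.1"
model = AutoModelForCausalLM.from_pretrained(hf_model_id, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(hf_model_id, legacy=False)

# Build a GenerationConfig from the model repo, overriding a few parameters.
generation_config = GenerationConfig.from_pretrained(
    hf_model_id, max_new_tokens=200, temperature=0.1, do_sample=True
)

prompt = "Tell me a short story."  # placeholder prompt
# Tokenize the prompt and move the tensors to the model's device.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Passing generation_config explicitly makes generate() use these settings.
output_ids = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))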

Let me know if it worked.

Best,
Borell
