do_sample defaults to False when I try to set it to True

Hi,

I am trying to run inference on the model lmsys/vicuna-7b-v1.5. I load the model and set up the pipeline/prompting as follows:

import torch
import transformers
from transformers import AutoTokenizer

model_name = "lmsys/vicuna-7b-v1.5"

tokenizer = AutoTokenizer.from_pretrained(model_name)

model_pipeline = transformers.pipeline(
    "text-generation",
    model=model_name,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    max_new_tokens=max_new_toks,  # max_new_toks is set earlier in my script
    device_map="auto",
)

# There are other traditional LLM inference settings that can be modified here
sequences = model_pipeline(
    input_prompt,  # input_prompt is set earlier in my script
    do_sample=True,
)

Even though I set `do_sample` to `True`, I get numerous warnings that say:

UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.6` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.

Am I doing something incorrectly when setting do_sample? When the model is loaded, does it set all inference parameters to those in the model’s associated generation_config.json file? How would you recommend resolving this? Thank you!
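For context, loading just the generation config shows what is stored on the Hub (a minimal sketch; it simply prints the fields from generation_config.json):

from transformers import GenerationConfig

# Pull only generation_config.json from the Hub and inspect its contents
gen_config = GenerationConfig.from_pretrained("lmsys/vicuna-7b-v1.5")
print(gen_config)  # sampling parameters like top_p are set; do_sample is absent, so it defaults to False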

Hi,

Thanks for reporting; that looks like a bug. It might have to do with the generation config on the Hub: generation_config.json · lmsys/vicuna-7b-v1.5 at main.

cc @joaogante

Indeed, the warning is printed when loading the generation config from the Hub, so it can be avoided by modifying the config directly on the Hub.

The generation in this code snippet is done with sampling, as indicated by do_sample=True.
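In the meantime, the warning can be avoided locally by loading the config, making the flags consistent, and passing it at call time (a sketch, assuming a recent transformers version; the pipeline forwards generation_config on to model.generate):

from transformers import GenerationConfig

# Load the Hub config, fix the inconsistent flag, and pass it explicitly
gen_config = GenerationConfig.from_pretrained("lmsys/vicuna-7b-v1.5")
gen_config.do_sample = True

sequences = model_pipeline(
    input_prompt,
    generation_config=gen_config,
)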

Thanks for the responses. How would I be able to modify the generation config on the Hub? Also, it doesn’t look like the generation config on the Hub mentions do_sample anywhere.

I think the only way is to open a PR on the Hub asking the owners to add do_sample=True to the generation config file. Currently it doesn’t mention do_sample, and in that case the default value is False.
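A PR can be opened from the Community tab on the model page, or programmatically; a sketch of the programmatic route (assuming you are logged in to the Hub; create_pr opens a pull request instead of committing directly):

from transformers import GenerationConfig

# Load the current config, add the missing flag, and open a PR on the repo
cfg = GenerationConfig.from_pretrained("lmsys/vicuna-7b-v1.5")
cfg.do_sample = True
cfg.push_to_hub("lmsys/vicuna-7b-v1.5", create_pr=True)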

@anniedoris I think the problem might be how you’re passing the keyword argument to the pipeline. Instead of setting do_sample=True, try passing a dict: {"do_sample": True}
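If that is the issue, the dict would be splatted into the call (untested sketch; note this is functionally equivalent to the plain keyword form):

# Unpack the dict of generate kwargs into the pipeline call
sequences = model_pipeline(input_prompt, **{"do_sample": True})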