How to apply decoding method and penalty

I’m trying to apply the parameters below in my code. Does anyone know how to do that? Is it possible with the google/flan-t5-xl model?

Parameters I want to apply:

{
    "decoding_method": "greedy",
    "max_new_tokens": 5,
    "repetition_penalty": 1
}

Code from google/flan-t5-xl model:

from transformers import T5Tokenizer, T5ForConditionalGeneration
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xl", decoding_method="greedy",)
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xl")
input_ids = tokenizer(prompt_input, return_tensors="pt", ).input_ids
outputs = model.generate(input_ids, max_new_tokens=5)
print(f"Sentimental: {tokenizer.decode(outputs[0], skip_special_tokens=True)}")

Hi,

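Sure. Decoding is controlled by the model's generate method, not by the tokenizer, so decoding_method="greedy" does not belong in T5Tokenizer.from_pretrained and can be dropped. generate uses greedy decoding by default (do_sample=False, num_beams=1), so that part is already covered. You can pass the remaining parameters as keyword arguments. Note that repetition_penalty=1.0 is the default and applies no penalty:

from transformers import T5Tokenizer, T5ForConditionalGeneration

# Decoding options are set in generate(), not on the tokenizer.
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xl")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xl")
input_ids = tokenizer(prompt_input, return_tensors="pt").input_ids  # prompt_input is your prompt string

# Greedy decoding is the default, so only the remaining parameters are passed.
generation_kwargs = {"max_new_tokens": 5, "repetition_penalty": 1.0}
outputs = model.generate(input_ids, **generation_kwargs)
print(f"Sentimental: {tokenizer.decode(outputs[0], skip_special_tokens=True)}")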
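If you want to keep your parameter dict exactly as written in the question, one option is to translate it into generate keyword arguments first. The following is only a minimal sketch: to_generate_kwargs is a hypothetical helper (not part of transformers), and the "sample" value for decoding_method is an assumption about what your parameter format allows. The do_sample and num_beams names are real generate arguments.

```python
def to_generate_kwargs(params):
    """Translate a dict with a "decoding_method" key into keyword
    arguments accepted by transformers' model.generate().

    Hypothetical helper; the "sample" method name is an assumption.
    """
    params = dict(params)  # copy so the caller's dict is not modified
    method = params.pop("decoding_method", "greedy")

    if method == "greedy":
        # Greedy decoding: no sampling, a single beam.
        kwargs = {"do_sample": False, "num_beams": 1}
    elif method == "sample":
        # Assumed name for sampling-based decoding.
        kwargs = {"do_sample": True}
    else:
        raise ValueError(f"unsupported decoding_method: {method!r}")

    # Remaining keys (max_new_tokens, repetition_penalty, ...) pass through.
    kwargs.update(params)

    # repetition_penalty is a float in generate(); 1.0 means no penalty.
    if "repetition_penalty" in kwargs:
        kwargs["repetition_penalty"] = float(kwargs["repetition_penalty"])
    return kwargs


requested = {"decoding_method": "greedy", "max_new_tokens": 5, "repetition_penalty": 1}
generation_kwargs = to_generate_kwargs(requested)
print(generation_kwargs)
```

The result can then be passed as model.generate(input_ids, **generation_kwargs), the same way as above.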