Is there a way to pass the generate function a list of different values for, say, max_new_tokens, min_new_tokens and temperature (one value per returned sequence) when num_return_sequences is 10? Ideally something like this:
min_new_tokens_list = [int(paragraph_length * (0.3 + i * 0.02)) for i in range(num_return_sequences)]
max_new_tokens_list = [int(paragraph_length * (0.6 + i * 0.075)) for i in range(num_return_sequences)]
temperature_list = [0.8 if i % 2 == 0 else 0.9 for i in range(num_return_sequences)]
beam_outputs = model.generate(
    input_ids,
    min_new_tokens=min_new_tokens_list,
    max_new_tokens=max_new_tokens_list,
    temperature=temperature_list,
    num_return_sequences=num_return_sequences,
    num_beams=20,
    no_repeat_ngram_size=3,
    bad_words_ids=filter_token_ids,
    pad_token_id=tokenizer.eos_token_id,
    attention_mask=(input_ids != tokenizer.pad_token_id).to("cuda:0"),
    do_sample=True,
    top_p=0.95,
)
Using a loop to accomplish this instead increases the inference time by roughly 10x.
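For reference, the loop workaround I mean looks roughly like the sketch below. The helper names `build_per_sequence_kwargs` and `generate_in_loop` are made up for illustration, and `generate_fn` is a stand-in for `model.generate`. Each iteration is a separate generation call with its own settings, which is presumably where the extra time goes.

```python
from typing import Any, Callable, Dict, List


def build_per_sequence_kwargs(
    paragraph_length: int, num_return_sequences: int
) -> List[Dict[str, Any]]:
    """Build one kwargs dict per desired output (same formulas as the lists above)."""
    return [
        {
            "min_new_tokens": int(paragraph_length * (0.3 + i * 0.02)),
            "max_new_tokens": int(paragraph_length * (0.6 + i * 0.075)),
            "temperature": 0.8 if i % 2 == 0 else 0.9,
        }
        for i in range(num_return_sequences)
    ]


def generate_in_loop(
    generate_fn: Callable[..., Any],  # assumption: stands in for model.generate
    input_ids: Any,
    per_sequence_kwargs: List[Dict[str, Any]],
    **shared_kwargs: Any,
) -> List[Any]:
    """Call generate_fn once per parameter set, asking for one sequence each time.

    shared_kwargs must not contain num_return_sequences or any per-sequence key,
    otherwise Python raises a TypeError for the duplicate keyword.
    """
    outputs = []
    for kwargs in per_sequence_kwargs:
        outputs.append(
            generate_fn(input_ids, num_return_sequences=1, **shared_kwargs, **kwargs)
        )
    return outputs
```

With a real model this would be called as `generate_in_loop(model.generate, input_ids, build_per_sequence_kwargs(paragraph_length, 10), num_beams=20, do_sample=True, top_p=0.95)`, plus the other shared arguments from the call above.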