Looks like the pull request is here: https://github.com/huggingface/transformers/pull/14654 and it is implemented in transformers v4.16.0.
Can you please explain the scores returned by generate in detail, in particular when we use a batch_size > 1?
Why does applying argmax() on scores not give the same tokens as sequences?
With batch_size > 1, why is the shape of scores (batch_size*num_beams, vocab_len) rather than (batch_size, num_beams, vocab_len)? It is really confusing.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
pad_index = tokenizer.convert_tokens_to_ids(tokenizer.pad_token)
unk_index = tokenizer.convert_tokens_to_ids(tokenizer.unk_token)
eos_index = tokenizer.convert_tokens_to_ids(tokenizer.eos_token)

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
model.resize_token_embeddings(len(tokenizer))
model.to("cuda")

# input sequences
seq1 = "summarize: I am confused! I am confused"
seq2 = "summarize: why generate does not work with batch_size >1"

# encode the inputs and build the attention mask
encoding = tokenizer(
    [seq1, seq2],
    padding="longest",
    max_length=128,
    truncation=True,
    return_tensors="pt",
)
input_ids = encoding.input_ids.to("cuda")
attention_mask = encoding.attention_mask.to("cuda")

output = model.generate(
    input_ids,
    attention_mask=attention_mask,  # mask out padding in the batched inputs
    max_length=64,
    early_stopping=False,  # so that len(output.scores) == generated length
    num_beams=4,
    do_sample=False,
    output_scores=True,
    no_repeat_ngram_size=4,
    return_dict_in_generate=True,
    num_return_sequences=1,
)
tokenizer.batch_decode(output.sequences, skip_special_tokens=True)

# output.sequences
# tensor([[   0,   27,  183, 11319,   55,    1,    0,    0,    0,    0,
#             0,    0,    0,    0],
#         [   0, 3806,  405,   59,  161,   28, 11587,  834, 7991, 2490,
#           536,    3,    5,    1]], device='cuda:0')
# How can we get the above token indices from output.scores?
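For what it's worth: with beam search, each entry of output.scores has shape (batch_size * num_beams, vocab_len) because the beam dimension is flattened into the batch dimension, and each row holds the next-token scores of one beam hypothesis before the beams are reordered. That is why a per-step argmax() does not reproduce output.sequences: the final sequence is assembled by following beam indices across steps, not by taking the top token at every step. A minimal sketch of recovering the per-beam view, assuming the batch_size = 2, num_beams = 4 setup above:

batch_size, num_beams = 2, 4
# output.scores is a tuple with one entry per generation step;
# each entry has shape (batch_size * num_beams, vocab_len)
step_scores = output.scores[0]
per_beam = step_scores.view(batch_size, num_beams, -1)
print(per_beam.shape)  # torch.Size([2, 4, 32100]) after the resize above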
Could you elaborate on how you chose gen_probs.prod(-1) as your method of obtaining a unique probability per sequence? Why not use gen_probs.mean(-1) for the average probability score per sequence?
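Chiming in on the prod-vs-mean point: multiplying the per-token probabilities gives the joint probability of the whole sequence, since the model factorizes sequence probability into a product of conditionals, whereas a plain mean of probabilities does not correspond to any probability. The usual length-insensitive alternative is the average log-probability. A short sketch, assuming gen_probs holds per-token probabilities of shape (num_sequences, seq_len):

joint_prob = gen_probs.prod(-1)         # P(y) = product over t of P(y_t | y_<t, x)
avg_logprob = gen_probs.log().mean(-1)  # length-normalized log-probability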
Hey everyone
We have released a new function to solve this problem; have a look at this thread: [Announcement] Generation: Get probabilities for generated output
Since some of the snippets at the start of this thread no longer match our API, I'd like to ask for new questions/comments to be posted on the thread I've linked above.
(@danielcabal, you might find an answer to your question in the post I linked.)
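For readers who don't follow the link: the function announced there is compute_transition_scores, available on generation-capable models in recent transformers releases (v4.26 and later, if I recall correctly). A minimal sketch against the beam-search output above:

# requires output_scores=True and return_dict_in_generate=True in generate()
transition_scores = model.compute_transition_scores(
    output.sequences, output.scores, beam_indices=output.beam_indices, normalize_logits=False
)
# one score per generated token; summing over the token dimension
# recovers the sequence-level beam score
print(transition_scores.sum(dim=1))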
Hey @Keverino, were you able to solve this?