I am using the BART model to build a chatbot, and I generate responses from the model as follows:
generated_output = model.generate(
    input_ids=input_ids,
    attention_mask=mask,
    output_scores=True,
    return_dict_in_generate=True,
)
But the sequence decoded from the `sequences` field:
gen_tokens = generated_output['sequences']
gen_tokens_seq = [tokenizer.decode(g, skip_special_tokens=True) for g in gen_tokens]
and the one reconstructed by taking the argmax of each step in `scores`:
gen_ids = []
num_generated_tokens = len(generated_output['scores'])
for i1 in range(num_generated_tokens):
    temptensor = generated_output['scores'][i1][0]
    gen_id = torch.argmax(temptensor).item()
    gen_ids.append(gen_id)
gen_ids = torch.tensor(gen_ids).view(1, -1)
gen_ids_seq = [tokenizer.decode(g, skip_special_tokens=True) for g in gen_ids]
are not the same. I need the logit vectors for the token sequence that model.generate() actually produced, but `scores` is not returning what I expect. What can I do to get the logit values for the `sequences` returned by model.generate()?
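For reference, here is the reconstruction step I am doing, sketched on toy tensors so it runs without a model. The `scores` stand-in below is an assumption: a tuple with one entry per generated step, each of shape `(batch_size, vocab_size)`, which is the shape I observe from `generate`:

```python
import torch

# Toy stand-in for generated_output['scores']: one entry per generated
# step, each of shape (batch_size=1, vocab_size=10).
scores = (
    torch.tensor([[0.1, 0.2, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]),  # argmax -> 2
    torch.tensor([[0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]),  # argmax -> 1
)

# Greedy reconstruction: argmax over each step's scores for batch item 0,
# then reshape to (1, num_generated_tokens) for the tokenizer.
gen_ids = torch.tensor([step[0].argmax().item() for step in scores]).view(1, -1)
print(gen_ids.tolist())  # [[2, 1]]
```

On these toy tensors the argmax path behaves as I expect; with the real model output, the ids it yields differ from `sequences`.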