Hi all
I was wondering if I could ask you some questions about how to use .generate()
for BART or other pretrained models. The example code is:
from transformers import BartTokenizer, BartForConditionalGeneration, BartConfig
path = 'facebook/bart-large'
model = BartForConditionalGeneration.from_pretrained(path)
tokenizer = BartTokenizer.from_pretrained(path)
ARTICLE_TO_SUMMARIZE = "My friends are cool but they eat too many carbs."
inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=1024, return_tensors='pt')
# Generate Summary
summary_ids = model.generate(
inputs['input_ids'],
num_beams=4, num_return_sequences=2, max_length=5, early_stopping=True,
output_scores=True, return_dict_in_generate=True,
)
print(summary_ids.keys())
print(summary_ids['sequences'])
print(summary_ids['sequences_scores'])
print(len(summary_ids['scores'][0]))
print(summary_ids['scores'][0].size())
print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False)
for g in summary_ids['sequences']])
Then, the output is,
odict_keys(['sequences', 'sequences_scores', 'scores'])
tensor([[ 2, 2387, 2387, 964, 2],
[ 2, 2387, 4, 4, 2]])
tensor([0.8599, 0.9924])
4
torch.Size([4, 50265])
['MyMy friends', 'My..']
Do not worry about the poor performance, ['MyMy friends', 'My..'], since I am only trying to understand how this works. So, my questions are:

1. return_dict_in_generate=True returns ['sequences'], but together with output_scores=True, it returns ['sequences', 'sequences_scores', 'scores']. There are other arguments, like output_attentions or output_hidden_states. The BartForConditionalGeneration documentation does not explain anything about .generate(). So, I searched further and found Utilities for Generation (Utilities for Generation — transformers 4.5.0.dev0 documentation), which seems to cover generating outputs with .generate(), and the Huggingface transformers model page, which seems to cover the general methods of the base class, PreTrainedModel, but there is no document that explains what each field, ['sequences', 'sequences_scores', 'scores'], actually contains or how it is computed. Where is the documentation for this?
2. Is sequences_scores computed as \sum_{t} \log p(y_{t} \mid x, y_{<t})?
3. How do you get sequences_scores from scores? My initial guess was to apply softmax on scores in dim=1, then get topk with k=1, but this gives me a very weird answer.
import torch
sm = torch.nn.functional.softmax(summary_ids['scores'][0], dim=1)
topk = sm.topk(k=1, dim=1)
print(sm)
print(topk)
print(summary_ids['sequences'][0])
which comes out as
tensor([[1.2851e-04, 8.8341e-12, 2.4085e-06, ..., 3.9426e-12, 2.8815e-12,
1.0564e-08],
[1.9899e-05, 1.9899e-05, 1.9899e-05, ..., 1.9899e-05, 1.9899e-05,
1.9899e-05],
[1.9899e-05, 1.9899e-05, 1.9899e-05, ..., 1.9899e-05, 1.9899e-05,
1.9899e-05],
[1.9899e-05, 1.9899e-05, 1.9899e-05, ..., 1.9899e-05, 1.9899e-05,
1.9899e-05]])
torch.return_types.topk(
values=tensor([[9.9271e-01],
[1.9899e-05],
[1.9899e-05],
[1.9899e-05]]),
indices=tensor([[2387],
[ 0],
[ 0],
[ 0]]))
tensor([ 2, 2387, 2387, 964, 2])
The first token, 2387, appears to be correct, but from the second step on, the probability is 1.9899e-05, which is just equivalent to 1/len(tokenizer). This suggests that all tokens are equally likely to be generated, which cannot be right. So, how do you get sequences_scores from scores?
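To make my guess about sequences_scores concrete, here is a toy sketch of what I am assuming it to be: the sum of the per-step log probabilities of the chosen tokens, divided by the sequence length (since the default length_penalty is 1.0, I believe). The logits and token ids below are made up for illustration, not real BART outputs.

```python
import torch

# Made-up next-token logits for 3 decoding steps over a tiny vocabulary of 5
# tokens (NOT real BART outputs).
step_logits = [
    torch.tensor([0.1, 2.0, 0.3, 0.0, 0.5]),
    torch.tensor([1.5, 0.2, 0.1, 0.0, 0.0]),
    torch.tensor([0.0, 0.1, 0.2, 3.0, 0.0]),
]
chosen_tokens = [1, 0, 3]  # token picked at each step (also made up)

# Log probability of each chosen token at its step
log_probs = [
    torch.log_softmax(logits, dim=-1)[tok]
    for logits, tok in zip(step_logits, chosen_tokens)
]
total_log_prob = sum(log_probs)

# My guess: sequences_scores = total log prob / length ** length_penalty
length_penalty = 1.0
sequence_score = total_log_prob / (len(chosen_tokens) ** length_penalty)
print(sequence_score)
```

If this guess were right, I would expect sequences_scores to be negative (a sum of log probabilities), which makes the positive values 0.8599 and 0.9924 in my output even more confusing.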
4. How do I get the conditional probabilities of all the output tokens? For example, if .generate() gives [I, am, student] as output, how do I get the conditional probability of each token, [Pr(I | x), Pr(am | x, I), Pr(student | x, I, am)]? Initially, I thought it was 'scores', but I am not sure now.
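To show what I mean, here is a toy sketch of how I imagined extracting each token's conditional probability, assuming greedy search (num_beams=1), where I would take the softmax of each step's scores and read off the entry for the generated token. The tensors are made up; with beam search the rows of scores seem to correspond to beams rather than batch items, which is where I get lost.

```python
import torch

# Made-up greedy-search outputs (not from BART): vocabulary of 5 tokens,
# 3 generation steps, batch size 1.
scores = (
    torch.tensor([[0.1, 2.0, 0.3, 0.0, 0.5]]),  # step 1 logits
    torch.tensor([[1.5, 0.2, 0.1, 0.0, 0.0]]),  # step 2 logits
    torch.tensor([[0.0, 0.1, 0.2, 3.0, 0.0]]),  # step 3 logits
)
# Generated sequence: decoder start token followed by one token per step
sequences = torch.tensor([[2, 1, 0, 3]])

# Pr(y_t | x, y_<t) = softmax(scores[t])[y_t], skipping the start token
token_probs = [
    torch.softmax(step, dim=-1)[0, tok].item()
    for step, tok in zip(scores, sequences[0, 1:])
]
print(token_probs)
```

Is this the right way to read scores, at least for greedy search?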
5. Since I find it difficult to find documentation on .generate() or any of the information above, is this something that experienced researchers in NLP or programming would just be able to guess?
Thank you in advance