Hi,
I am using a fine-tuned BLIP model for image captioning. I want to look up a specific token’s score in the vocabulary. As I understand it, BLIP generates a caption by picking the highest-scoring token at each position. I want to find the score for a specific token that may or may not appear in the generated caption. BLIP uses the BertTokenizer.
for idx, batch in enumerate(val_dataloader):
    filenames = batch.pop('filename')
    pixel_values = batch['pixel_values'].to(device)
    generated_ids = model.generate(pixel_values=pixel_values, max_length=500, output_scores=True, return_dict_in_generate=True)
    for fn in filenames:
        # iterate over each position's vocab set
        for vocab in generated_ids.scores:
            print("vocab: ", vocab)
Here vocab is a tensor with scores for each token in the vocab set. However, there is no corresponding token id for me to be able to look up a specific token’s score.
vocab
tensor([[ 1.3996, -1.0718, -1.0718,  ..., -1.0718, -1.0719, -1.0719]],
       device='cuda:0')
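To make the lookup I am after concrete, here is a minimal sketch. It assumes that the last dimension of each scores tensor is ordered by token id (index 0 is token id 0, and so on), so a token's score is found by indexing with the id the tokenizer assigns to it. The names toy_vocab, toy_scores and token_scores are hypothetical and the numbers are made up. With the real model I would expect to replace the toy data with generated_ids.scores and get the id from the tokenizer's convert_tokens_to_ids method.

```python
# Toy stand-in for generated_ids.scores: one entry per generated position,
# each entry shaped (batch_size, vocab_size). All values are made up.
toy_vocab = {"[PAD]": 0, "a": 1, "dog": 2, "cat": 3}  # hypothetical token -> id map
toy_scores = (
    [[-1.0, 2.5, 0.1, 0.3]],  # position 0
    [[-1.0, 0.2, 1.7, 1.2]],  # position 1
)


def token_scores(scores, token_id, batch_index=0):
    """Return the score of token_id at every generated position.

    Assumption: index i along the last dimension is the score of token id i.
    Works on nested lists and on tensors, since both support [row][col] indexing.
    """
    return [float(step[batch_index][token_id]) for step in scores]


# With the real tokenizer this would be something like
# tokenizer.convert_tokens_to_ids("cat") (assumed, not run here).
cat_id = toy_vocab["cat"]
cat_scores = token_scores(toy_scores, cat_id)
print(cat_scores)
```

Two caveats if that assumption holds: with beam search the first dimension would be batch size times number of beams rather than batch size, and the values are raw scores, so a softmax over the last dimension would be needed to read them as probabilities.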
Any help would be appreciated!
Thanks,
Nina