How to mask multiple tokens in BartForConditionalGeneration?


How can I use BartForConditionalGeneration to predict multiple tokens instead of one? Or is the text filling task not (yet) supported?

I can't remember their name, but someone wrote the code below, which handles several masked tokens in one sentence. If you replace Roberta with Bart, I expect it'll work.

import torch
from transformers import RobertaTokenizer, RobertaForMaskedLM

tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
model = RobertaForMaskedLM.from_pretrained('roberta-base')

sentence = "My name is <mask> and I enjoy <mask>."
token_ids = tokenizer.encode(sentence, return_tensors='pt')

# Positions of every <mask> token in the encoded input
masked_position = (token_ids.squeeze() == tokenizer.mask_token_id).nonzero()
masked_pos = [mask.item() for mask in masked_position]
print(masked_pos)

with torch.no_grad():
    output = model(token_ids)

# output[0] holds the vocabulary logits, shape (seq_len, vocab_size) once the batch dim is dropped
logits = output[0].squeeze(0)

print("sentence:", sentence)

list_of_list = []
for mask_index in masked_pos:
    mask_logits = logits[mask_index]
    # Indices of the 100 highest-scoring vocabulary entries for this mask
    idx = torch.topk(mask_logits, k=100, dim=0)[1]
    words = [tokenizer.decode(i.item()).strip() for i in idx]
    list_of_list.append(words)
    print(words)

# Join the top-1 candidate for each mask, in order
best_guess = ""
for j in list_of_list:
    best_guess = best_guess + " " + j[0]
print("best guess:", best_guess.strip())
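
For reference, here is a rough sketch of the Bart swap suggested above. It is an untested adaptation, not an official recipe: the checkpoint name `facebook/bart-base` and the helper names `top_k_per_mask` and `fill_masks` are my own assumptions for illustration. It reads one set of logits per `<mask>` position, so it should give one token per mask, the same as the Roberta version.

```python
def top_k_per_mask(sentence, model_name="facebook/bart-base", k=5):
    """Return the k highest-scoring tokens for each <mask> in the sentence.

    Hypothetical helper. Downloads the checkpoint on first use; the
    checkpoint name is an assumption, any Bart checkpoint should do.
    """
    import torch
    from transformers import BartTokenizer, BartForConditionalGeneration

    tokenizer = BartTokenizer.from_pretrained(model_name)
    model = BartForConditionalGeneration.from_pretrained(model_name)

    inputs = tokenizer(sentence, return_tensors="pt")
    # Positions of every <mask> token in the encoded input
    is_mask = inputs["input_ids"][0] == tokenizer.mask_token_id
    mask_positions = is_mask.nonzero().flatten().tolist()

    with torch.no_grad():
        # logits has shape (seq_len, vocab_size) after dropping the batch dim
        logits = model(**inputs).logits[0]

    predictions = []
    for pos in mask_positions:
        top_ids = logits[pos].topk(k).indices.tolist()
        predictions.append([tokenizer.decode([i]).strip() for i in top_ids])
    return predictions


def fill_masks(sentence, predictions, mask_token="<mask>"):
    """Replace each mask, left to right, with its top-1 candidate."""
    filled = sentence
    for words in predictions:
        filled = filled.replace(mask_token, words[0], 1)
    return filled
```

Usage would be something like `fill_masks(sentence, top_k_per_mask(sentence))`. The second helper is plain string handling, so it can be tried with hand-written candidates before loading any model.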

Thank you so much :slight_smile: this works!!


I remember seeing this example for RobertaForMaskedLM, but I didn’t expect that it would be able to fill multiple tokens, or spans of tokens, for one mask (which should be part of the Bart training objective). @Kwiebes1995, did you find that this could infill spans, or does it just generate single tokens?
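
One way to check the span question is to let Bart regenerate the whole sentence with `generate()` and then compare the output against the masked input. The sketch below is only that, a sketch: it assumes the `facebook/bart-large` checkpoint and the `forced_bos_token_id=0` setting used in the Hugging Face mask-filling example, and both helper names (`infill_with_generate`, `extract_filled_spans`) are made up here. I am not claiming what the model will actually produce.

```python
import re


def infill_with_generate(sentence, model_name="facebook/bart-large", max_length=50):
    """Regenerate the sentence so each <mask> can expand to a span.

    Hypothetical helper. Downloads the checkpoint on first use.
    """
    from transformers import BartTokenizer, BartForConditionalGeneration

    tokenizer = BartTokenizer.from_pretrained(model_name)
    model = BartForConditionalGeneration.from_pretrained(
        model_name, forced_bos_token_id=0
    )
    inputs = tokenizer(sentence, return_tensors="pt")
    output_ids = model.generate(
        inputs["input_ids"], num_beams=4, max_length=max_length
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


def extract_filled_spans(template, generated, mask_token="<mask>"):
    """Return the text that replaced each mask, or None if the unmasked
    parts of the template were not reproduced exactly."""
    parts = [re.escape(p) for p in template.split(mask_token)]
    pattern = "^" + "(.*?)".join(parts) + "$"
    match = re.match(pattern, generated, flags=re.DOTALL)
    if match is None:
        return None
    return [group.strip() for group in match.groups()]
```

If `extract_filled_spans(sentence, infill_with_generate(sentence))` returns entries with more than one word, the mask was filled with a span rather than a single token. A `None` result means the model also rewrote the surrounding text, so the comparison does not apply.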