I would like to use multiple [MASK] tokens in one sentence, but I get an error when I run the code below.
What do I need to change to fix it?
Instead of: text = "The capital of France, " + tokenizer.mask_token + ", contains the Eiffel Tower."
I need: text = "The capital of France, " + tokenizer.mask_token + ", contains the Eiffel " + tokenizer.mask_token
from transformers import BertTokenizer, BertForMaskedLM
from torch.nn import functional as F
import torch
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMaskedLM.from_pretrained('bert-base-uncased', return_dict = True)
text = "The capital of France, " + tokenizer.mask_token + ", contains the Eiffel Tower."
# Tokenize the sentence and find the position of the [MASK] token
input = tokenizer.encode_plus(text, return_tensors = "pt")
mask_index = torch.where(input["input_ids"][0] == tokenizer.mask_token_id)
# Run the model and turn the logits into probabilities over the vocabulary
output = model(**input)
logits = output.logits
softmax = F.softmax(logits, dim = -1)
# Take the 10 most likely tokens at the masked position
mask_word = softmax[0, mask_index, :]
top_10 = torch.topk(mask_word, 10, dim = 1)[1][0]
# Print one sentence per candidate token
for token in top_10:
    word = tokenizer.decode([token])
    new_sentence = text.replace(tokenizer.mask_token, word)
    print(new_sentence)
I’ve used the code from here
I’ve already looked at “Multiple Mask Tokens”, but I want the output to be a complete sentence.
I hope you can help me.
Kind regards,
Linda
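
One possible approach, offered as a sketch rather than a definitive fix: take the top-k predictions for each mask position separately, then fill the masks from left to right, one at a time. The single-mask code calls text.replace(tokenizer.mask_token, word), which puts the same word into every mask, so the sketch uses replace with a count of 1 instead. The helper names fill_masks and predict_sentences are made up for illustration and are not part of transformers. Note the assumption that each mask is predicted independently of the others, so the i-th sentence just pairs the i-th best guess for every mask; it is not a joint prediction.

```python
def fill_masks(text, mask_token, words):
    """Replace each mask in `text`, left to right, with the matching word."""
    for word in words:
        # count=1 replaces only the next remaining mask, not all of them
        text = text.replace(mask_token, word, 1)
    return text


def predict_sentences(text, k=10, model_name="bert-base-uncased"):
    """Return k sentences; sentence i uses the i-th best guess for every mask."""
    # Imported lazily so fill_masks stays usable without torch installed.
    import torch
    from transformers import BertTokenizer, BertForMaskedLM

    tokenizer = BertTokenizer.from_pretrained(model_name)
    model = BertForMaskedLM.from_pretrained(model_name)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    # 1-D tensor holding the position of every [MASK], in sentence order
    is_mask = inputs["input_ids"][0] == tokenizer.mask_token_id
    mask_positions = is_mask.nonzero(as_tuple=True)[0]

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

    # One row of vocabulary scores per mask: (num_masks, vocab_size)
    mask_logits = logits[0, mask_positions, :]
    # top_ids[m, i] is the i-th best token id for mask m: (num_masks, k)
    top_ids = torch.topk(mask_logits, k, dim=-1).indices

    sentences = []
    for rank in range(k):
        words = [tokenizer.decode([int(top_ids[m, rank])])
                 for m in range(top_ids.shape[0])]
        sentences.append(fill_masks(text, tokenizer.mask_token, words))
    return sentences
```

To try it, build text with two tokenizer.mask_token entries as in the question (for bert-base-uncased that is the literal string [MASK]) and print each item of predict_sentences(text). Softmax is skipped here because it does not change the ranking of the logits; add it back if you need probabilities. Some predictions may be word pieces starting with ##, which would need extra handling.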