How to get the index of the masked token after passing the sentence to the model

abdallah197 · September 7, 2020, 5:44pm

Assume I am using one of the BERT models for a masked language modeling task:

from transformers import BertForMaskedLM, BertTokenizerFast
import torch
model = BertForMaskedLM.from_pretrained('bert-base-uncased')
tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')

Now let’s say I have the following sentence:

text = "Jeremy Bentham was the founder of modern utilitarianism"
inputs = tokenizer(text, return_tensors="pt", add_special_tokens=True, truncation=True, max_length=64)

Suppose I mask a random token somewhere in the sentence. For this example, let’s arbitrarily pick the token at index 7:

inputs['input_ids'][0][7] = tokenizer.mask_token_id

and then pass this to the model:

outputs = model(**inputs)  # unpack the encoding into keyword arguments
scores = outputs[0]  # prediction logits

If I want the embedding (vector) of only the masked token, how should I access it?

sgugger · September 8, 2020, 2:27pm

Your scores have shape [batch_size, seq_len, vocab_size], so scores[0][7] (batch item 0, position 7, where you placed the mask) should hold the predictions for the masked token.

abdallah197 · September 8, 2020, 2:50pm

@sgugger In this example, I already know the index because it was chosen arbitrarily before passing the input to the model. I am asking about cases where I don’t know the exact index of the masked token and I only receive the input with a randomly masked token.

What I had in mind is to search for the masked token in the input before it is passed to the model and save its index, so that I can access the masked token’s prediction later. But this takes linear time for each input example.

sgugger · September 8, 2020, 2:52pm

If you don’t save the positions you randomly masked, you have no other choice, though. Finding the location of the masked token will be quick in any case compared to running the model, as long as you use PyTorch functions for it.
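A minimal sketch of that lookup, under a few assumptions: the tensors below are small stand-ins for inputs['input_ids'] and outputs[0] so the snippet runs without loading a model, 103 is assumed to be tokenizer.mask_token_id for bert-base-uncased, and masked_positions is a hypothetical helper name, not part of transformers.

```python
import torch

# Assumed value of tokenizer.mask_token_id for bert-base-uncased;
# in real code, read it from the tokenizer instead of hard-coding it.
MASK_TOKEN_ID = 103


def masked_positions(input_ids, mask_token_id):
    """Return (batch_indices, sequence_indices) of every masked token."""
    # The comparison and nonzero() both run as vectorized tensor ops.
    return (input_ids == mask_token_id).nonzero(as_tuple=True)


# Stand-in for inputs['input_ids']: a batch of two sequences, one mask each.
input_ids = torch.tensor([
    [101, 7592, 103, 2088, 102],
    [101, 103, 2003, 2204, 102],
])

# Stand-in for outputs[0], with shape [batch_size, seq_len, vocab_size].
vocab_size = 30522
scores = torch.randn(input_ids.shape[0], input_ids.shape[1], vocab_size)

batch_idx, seq_idx = masked_positions(input_ids, MASK_TOKEN_ID)

# One row of vocabulary scores per masked token: [num_masked, vocab_size].
mask_scores = scores[batch_idx, seq_idx]
```

With the real model, the same two lines at the end would apply to inputs['input_ids'] and outputs[0]. Indexing with both index tensors at once also handles batches and sequences with more than one mask.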
