Questions on the `BertLMHeadModel`

Hello,
Sorry, I have an additional question; this one is about the BertForMaskedLM model.
The documentation for BertForMaskedLM provides the following example to illustrate the model’s usage:

>>> from transformers import BertTokenizer, BertForMaskedLM
>>> import torch

>>> tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
>>> model = BertForMaskedLM.from_pretrained('bert-base-uncased', return_dict=True)
>>> input_ids = tokenizer("Hello, my dog is cute", return_tensors="pt")["input_ids"]

>>> outputs = model(input_ids, labels=input_ids)
>>> loss = outputs.loss
>>> prediction_logits = outputs.logits

In the example above, I don’t see any [MASK] token in the input. Can BertForMaskedLM really be used with an input string that does not include a [MASK] token? If I give BertForMaskedLM an input string without a [MASK] token, from which token(s) will the model’s output be produced? In that case, would BertForMaskedLM automatically insert a [MASK] token at the beginning of the input sequence?
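For comparison, this is how I currently understand masked-LM usage, with an explicit [MASK] token. This is just my own sketch on the same checkpoint; the sentence and the way I locate the masked position are my own, not taken from the documentation:

>>> from transformers import BertTokenizer, BertForMaskedLM
>>> import torch

>>> tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
>>> model = BertForMaskedLM.from_pretrained('bert-base-uncased', return_dict=True)

>>> # Mask one word explicitly: "Hello, my dog is [MASK]"
>>> inputs = tokenizer("Hello, my dog is [MASK]", return_tensors="pt")
>>> outputs = model(**inputs)

>>> # Find the position of the [MASK] token and read off the top prediction there
>>> mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
>>> predicted_id = outputs.logits[0, mask_index].argmax(dim=-1)
>>> tokenizer.decode(predicted_id)

My confusion is that the documentation example has no [MASK] position at all, so I don’t know where the loss and predictions are supposed to come from.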

Thank you again,