Questions on the `BertLMHeadModel`

Hello,
Sorry, I have an additional question; this one is about the BertForMaskedLM model.
The documentation for BertForMaskedLM provides the following example to illustrate the model’s usage:

>>> from transformers import BertTokenizer, BertForMaskedLM
>>> import torch

>>> tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
>>> model = BertForMaskedLM.from_pretrained('bert-base-uncased', return_dict=True)
>>> input_ids = tokenizer("Hello, my dog is cute", return_tensors="pt")["input_ids"]

>>> outputs = model(input_ids, labels=input_ids)
>>> loss = outputs.loss
>>> prediction_logits = outputs.logits

In the example above, I don’t see any [MASK] token in the input. Can BertForMaskedLM really be used with an input string that does not include a [MASK] token? If I give BertForMaskedLM an input string without a [MASK] token, from which token(s) will the model’s output be produced? In that case, would BertForMaskedLM automatically insert a [MASK] token at the beginning of the input sequence?
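For comparison, this is how I currently understand masked-LM usage, with an explicit [MASK] token. This is just my own sketch on the same checkpoint; the sentence and the way I locate the masked position are my own, not taken from the documentation:

>>> from transformers import BertTokenizer, BertForMaskedLM
>>> import torch

>>> tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
>>> model = BertForMaskedLM.from_pretrained('bert-base-uncased', return_dict=True)

>>> # Mask one word explicitly: "Hello, my dog is [MASK]"
>>> inputs = tokenizer("Hello, my dog is [MASK]", return_tensors="pt")
>>> outputs = model(**inputs)

>>> # Find the position of the [MASK] token and read off the top prediction there
>>> mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
>>> predicted_id = outputs.logits[0, mask_index].argmax(dim=-1)
>>> tokenizer.decode(predicted_id)

My confusion is that the documentation example has no [MASK] position at all, so I don’t know where the loss and predictions are supposed to come from.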

Thank you again,