I would like to ask about the usage of _tied_weights_keys. I noticed that "cls.predictions.decoder.weight" is not listed in _tied_weights_keys for BertForMaskedLM, even though the decoder weight should be tied to the input embeddings.
However, when I try to check this with the following code:
from transformers import AutoModel
from accelerate.utils import find_tied_parameters  # find_tied_parameters is provided by accelerate

model = AutoModel.from_pretrained("bert-base-uncased")  # model id is lowercase on the Hub
print(find_tied_parameters(model))  # returns an empty result
find_tied_parameters reports no tied parameters. So I would like to know: how do I tie the input embeddings to the decoder weights of the output head?
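For reference, here is a minimal sketch of the check I had in mind, assuming the masked-LM class is loaded instead of the bare encoder (AutoModel resolves to BertModel, which has no decoder head, which may explain the empty result above):

from transformers import AutoModelForMaskedLM
from accelerate.utils import find_tied_parameters

# Load the class that actually carries the MLM decoder head
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# When config.tie_word_embeddings is True (the default for BERT),
# the decoder should share its weight tensor with the input embeddings
print(model.get_input_embeddings().weight is model.get_output_embeddings().weight)

print(find_tied_parameters(model))
# expected output (my assumption):
# [['bert.embeddings.word_embeddings.weight', 'cls.predictions.decoder.weight']]

# tie_weights() re-ties the weights explicitly, e.g. after swapping modules
model.tie_weights()

Is this the intended way to tie them, or is there something else _tied_weights_keys is supposed to handle?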