Hey there! I have a question about the difference between loading a multilingual BERT model from pretrained weights and instantiating it from a pretrained Config:
Shouldn’t the two models defined below have the same weights?
```python
from transformers import BertConfig, BertModel

# Load the model directly from the pretrained checkpoint
mbert_model_1 = BertModel.from_pretrained("bert-base-multilingual-uncased")

# Load only the config, then build a model from it
mbert_config = BertConfig.from_pretrained("bert-base-multilingual-uncased")
mbert_model_2 = BertModel(mbert_config)
```
I have checked that the two models have the same architecture, but their layer weights (and the outputs they produce) are different.
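For reference, here's roughly the check I ran to compare them:

```python
import torch

# Compare the two models parameter by parameter (a minimal sketch of my check)
sd1 = mbert_model_1.state_dict()
sd2 = mbert_model_2.state_dict()

assert sd1.keys() == sd2.keys()  # same architecture, so same parameter names
mismatched = [name for name in sd1 if not torch.equal(sd1[name], sd2[name])]
print(f"{len(mismatched)} of {len(sd1)} tensors differ")
```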
Sorry if this is a well-known question, but I had never loaded a model from a Config before and this discrepancy surprised me. (I've searched for a previous question on this topic but couldn't find one.)
Thanks for your help!