Clarify BERT model learnable parameters


I want to use HateBERT from the paper’s repository, which is a BERT model whose pre-training was extended on abusive language.

To do that, I created a BERT model (bert-base-uncased in PyTorch) and tried to load HateBERT’s weights with load_state_dict(), after making minor changes to the parameter names so they match BERT’s.

load_state_dict() throws the error:

RuntimeError: Error(s) in loading state_dict for BertModel:
	Missing key(s) in state_dict: "embeddings.position_ids".

which means the BERT model expects embeddings.position_ids. I inspected this tensor and it is not a learnable PyTorch parameter, just a plain tensor. From the error message it is also evident that all other parameters match, otherwise their names would be listed as well. Can someone explain?
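This is expected behavior: embeddings.position_ids is a *buffer*, not a parameter. Buffers registered with register_buffer() show up in state_dict() (so load_state_dict() looks for them) but not in parameters(), which is why the key can be "missing" even though every learnable weight matches. A minimal sketch with a hypothetical module (not the real BertEmbeddings) illustrates this, including how strict=False lets an older checkpoint load anyway:

```python
import torch
from torch import nn

class Embeddings(nn.Module):
    """Toy stand-in for BertEmbeddings (names are illustrative only)."""

    def __init__(self, max_len=512, dim=8):
        super().__init__()
        # A learnable parameter: appears in both parameters() and state_dict()
        self.word_embeddings = nn.Embedding(100, dim)
        # A buffer: appears in state_dict() but NOT in parameters()
        self.register_buffer("position_ids", torch.arange(max_len).unsqueeze(0))


emb = Embeddings()
print("position_ids" in dict(emb.named_parameters()))  # False: not learnable
print("position_ids" in emb.state_dict())              # True: still saved/loaded

# Simulate a checkpoint saved before the buffer existed:
state = {k: v for k, v in emb.state_dict().items() if k != "position_ids"}

# strict=False skips the missing buffer and reports it instead of raising
missing, unexpected = emb.load_state_dict(state, strict=False)
print(missing)  # ['position_ids']
```

Since position_ids is just torch.arange(max_len) and carries no trained information, loading with strict=False is safe here: the buffer keeps its freshly initialized value.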

HateBERT is available on the hub. GroNLP/hateBERT · Hugging Face

from transformers import AutoTokenizer, AutoModelForMaskedLM

# Load the tokenizer and masked-LM model directly from the Hub
tokenizer = AutoTokenizer.from_pretrained("GroNLP/hateBERT")
model = AutoModelForMaskedLM.from_pretrained("GroNLP/hateBERT")

Thank you @BramVanroy, and sorry for my late response.

Is there a way to check whether the Hugging Face Hub already has a given model, so that I would not have made the original post in the first place? I ask so that I can search better in advance and avoid similar posts in the future.

Sure, you can look for a model on the hub.
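Besides browsing huggingface.co/models, you can also query the Hub programmatically with the huggingface_hub client library. A small sketch (assumes network access; the search term is just an example):

```python
from huggingface_hub import HfApi

api = HfApi()
# list_models(search=...) matches model ids and card contents on the Hub
for model in api.list_models(search="hateBERT", limit=5):
    print(model.id)  # e.g. GroNLP/hateBERT
```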

Thank you @BramVanroy!
