The body of the model is named roberta for RoBERTa models, not bert. So you should loop with for param in model.roberta.parameters(). In general, the model-agnostic attribute is base_model, so for param in model.base_model.parameters() should work for any architecture.
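A minimal sketch of that freezing pattern, assuming transformers and torch are installed (it uses a tiny randomly-initialised RoBERTa with arbitrary config sizes so nothing needs to be downloaded):

```python
from transformers import RobertaConfig, RobertaForMaskedLM

# Tiny randomly-initialised RoBERTa; the config sizes here are arbitrary
# and only chosen so the example runs without downloading a checkpoint.
config = RobertaConfig(vocab_size=1000, hidden_size=32,
                       num_hidden_layers=2, num_attention_heads=2,
                       intermediate_size=64)
model = RobertaForMaskedLM(config)

# base_model is model-agnostic: it resolves to model.roberta here,
# to model.bert for BERT checkpoints, and so on.
for param in model.base_model.parameters():
    param.requires_grad = False

# Only the MLM head (minus weights tied to the embeddings) stays trainable.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

Note that the lm_head decoder weight is tied to the word embeddings, so freezing the base model freezes it too; only the untied head parameters remain trainable.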
Yes, you can use the named_parameters() method for that.
It gives you the names of the parameters along with the parameters themselves,
so you can filter only the parameters of the top / bottom layers based on their names and freeze them.
No, you should iterate over them.
Replace the following line: for param in model.roberta.parameters():
with: for name, param in model.roberta.named_parameters():.
Then filter the parameters you want to freeze using the name variable.
You may print the names to see what they look like, and then come up with a condition that filters the ones you need.
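For example, one way to do that filtering (a sketch, again on a tiny randomly-initialised RoBERTa so it runs without a download; which prefixes you freeze is up to you):

```python
from transformers import RobertaConfig, RobertaForMaskedLM

config = RobertaConfig(vocab_size=1000, hidden_size=32,
                       num_hidden_layers=4, num_attention_heads=2,
                       intermediate_size=64)
model = RobertaForMaskedLM(config)

# Inspect the names first; they follow patterns like
# embeddings.word_embeddings.weight, encoder.layer.0.attention... etc.
for name, _ in list(model.roberta.named_parameters())[:4]:
    print(name)

# Freeze the embeddings and the two bottom encoder layers by name prefix.
frozen_prefixes = ("embeddings.", "encoder.layer.0.", "encoder.layer.1.")
for name, param in model.roberta.named_parameters():
    if name.startswith(frozen_prefixes):
        param.requires_grad = False
```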
Hello, thanks very much for that explanation and solution - it has really helped me quite a lot. In terms of filtering parameters, say my dataset has emojis, can I filter emojis? Like, for example, this emoji?
Hmm, could you please elaborate? I’m not sure I understood your question.
BTW, if it is unrelated to the original question of this post, you may ask it in a new one.
Apologies, basically I am trying to do masked language modelling using emojis, but when I deploy my model, the predicted tokens only show words, not emojis; hence, I think that the emojis are not frequent enough in the vocabulary, causing them to be less likely for the masked prediction. Therefore, I was wondering if I could freeze some layers of my BERT model to get the less frequent tokens, which are emojis, to be the top predictions when I deploy my masked language model.
Have you tried increasing the masking probability for emojis?
If you have control over the masking during training, you can increase the probability that an emoji will be masked, and the model will then output higher probabilities for emojis.
I’m not sure this is a good solution but you may try it.
Changing the mlm_probability argument won't give you the result you need,
but I think you can create a subclass of DataCollatorForLanguageModeling that does the emoji masking.
You can find the source code for DataCollatorForLanguageModeling here.
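A sketch of what that subclass could look like. The names EmojiBoostCollator, emoji_token_ids and emoji_mlm_probability are hypothetical (not transformers API), and for brevity the override always substitutes the mask token instead of the 80/10/10 mask/random/keep split the stock collator uses:

```python
import torch
from transformers import DataCollatorForLanguageModeling

class EmojiBoostCollator(DataCollatorForLanguageModeling):
    """Hypothetical collator that masks emoji tokens with a higher
    probability than the usual mlm_probability."""

    def __init__(self, tokenizer, emoji_token_ids, emoji_mlm_probability=0.5, **kwargs):
        super().__init__(tokenizer=tokenizer, **kwargs)
        # Vocabulary ids of your emoji tokens, e.g. from
        # tokenizer.convert_tokens_to_ids(list_of_emoji_tokens).
        self.emoji_token_ids = torch.tensor(sorted(emoji_token_ids))
        self.emoji_mlm_probability = emoji_mlm_probability

    def torch_mask_tokens(self, inputs, special_tokens_mask=None):
        labels = inputs.clone()

        # Base masking probability everywhere, boosted on emoji positions.
        probability_matrix = torch.full(labels.shape, self.mlm_probability)
        probability_matrix[torch.isin(inputs, self.emoji_token_ids)] = self.emoji_mlm_probability

        # Never mask special tokens (same logic as the stock collator).
        if special_tokens_mask is None:
            special_tokens_mask = torch.tensor(
                [self.tokenizer.get_special_tokens_mask(row, already_has_special_tokens=True)
                 for row in labels.tolist()],
                dtype=torch.bool,
            )
        else:
            special_tokens_mask = special_tokens_mask.bool()
        probability_matrix.masked_fill_(special_tokens_mask, value=0.0)

        masked_indices = torch.bernoulli(probability_matrix).bool()
        labels[~masked_indices] = -100  # loss is computed on masked tokens only

        # Simplification: always substitute the mask token here (the stock
        # collator keeps 10% unchanged and randomises another 10%).
        inputs[masked_indices] = self.tokenizer.convert_tokens_to_ids(self.tokenizer.mask_token)
        return inputs, labels
```

You would then pass an instance of this class as data_collator when training, exactly as you would the stock DataCollatorForLanguageModeling.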