That’s because, by default, we train all parameters of the model. Hence, we compute gradients for all parameters and update all of them using gradient descent.
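As a minimal sketch of what that default looks like (assuming the transformers library and the allenai/longformer-base-4096 checkpoint, which this thread appears to be about):

```python
from transformers import LongformerForSequenceClassification

# Load a model for fine-tuning; every parameter requires gradients by default.
model = LongformerForSequenceClassification.from_pretrained(
    "allenai/longformer-base-4096", num_labels=2
)

# All parameter tensors show up here, i.e. all of them will be updated during training.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable parameter tensors")
```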
Does param.requires_grad == True mean that particular layer is frozen? I am confused by the wording requires_grad. Does it mean frozen?
No, requires_grad=True means that a parameter will get updated if you start training. To freeze a layer, you don’t want it to have gradients, so you need to set requires_grad to False.
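For example, a rough sketch (reusing the model variable from the sketch above) of freezing the base Longformer encoder while keeping the classification head trainable:

```python
# Freeze the base encoder: these parameters no longer receive gradients.
for param in model.longformer.parameters():
    param.requires_grad = False

# Only the classification head (model.classifier) remains trainable.
for name, param in model.named_parameters():
    if param.requires_grad:
        print("still training:", name)
```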
If I want to train only some of the earlier layers, as shown here, should I use the code below?
When fine-tuning language models such as BERT, RoBERTa, LongFormer, etc., we typically update all layers. However, recent research has shown that this is actually not necessary: you can get similar results just by fine-tuning the biases (!) of the layers.
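A hedged sketch of that bias-only setup (BitFit-style), again reusing the model from above; treat it as an illustration of the idea rather than the exact recipe from the paper:

```python
# Freeze everything except bias terms and the classification head on top.
for name, param in model.named_parameters():
    param.requires_grad = ("bias" in name) or name.startswith("classifier")
```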
Considering it takes a lot of time to train, is there a specific recommendation regarding which layers I should train?
The default is to train all layers, so you don’t need to set requires_grad to False anywhere.
- Do I need to add any additional layers such as dropout, or is that already taken care of by LongformerForSequenceClassification.from_pretrained? I am not seeing any dropout layers in the above output, and that’s why I’m asking.
This model already includes dropout. It’s not shown when printing the layers because dropout doesn’t have any trainable parameters (no weights or biases). You can see that it’s already included here. As can be seen, the classifier that is placed on top of the base LongFormer model is a LongformerClassificationHead, which includes a dropout layer.
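A quick way to check this yourself (assuming the model variable from the earlier sketches):

```python
# Print the classification head that sits on top of the base Longformer;
# it should show a Dropout module between the dense and output projection layers.
print(model.classifier)
```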