I want to extend a transformer model (BERT, ELECTRA, etc., for example) with a linear layer and initialize that linear layer with the same initializer the transformer uses. I also want save_pretrained and from_pretrained to work smoothly (i.e. save and load the model WITH the linear layer, not separately). This is what I'm currently doing, but it's not working:
from transformers import PreTrainedModel, AutoModel
import torch


class ExtendedTransformer(PreTrainedModel):
    base_model_prefix = "model"

    def __init__(self, config):
        super().__init__(config)
        self.model = AutoModel.from_pretrained('bert-base-uncased')
        self.linear = torch.nn.Linear(self.config.hidden_size, 128, bias=False)
        self.model.init_weights()
Any advice on how to do this properly?
Thanks
If you just want to increase the output dimensions, you can simply use
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=128)
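In that case the classification head is just a Linear layer registered on the model, so save_pretrained / from_pretrained already handle it. A quick sketch (the directory name is only an example):

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=128)
print(model.classifier)  # Linear(in_features=768, out_features=128, bias=True)

# The head is saved and restored together with the rest of the model
model.save_pretrained('bert-128-head')
model = AutoModelForSequenceClassification.from_pretrained('bert-128-head')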
But here’s an explanation of what I think the issue is with your code.
If I'm not mistaken, your code only initializes the model (which is already being initialized by the from_pretrained call), so the new linear layer is never touched. As far as I know, you should sub-class a specific architecture rather than PreTrainedModel, because the actual initialization logic lives in each architecture's _init_weights and models may implement different schemes for it. Since the end of last year it's probably best to call post_init instead of init_weights, too. Maybe you can achieve this for a variety of models with AutoModel as well, but I am not sure.
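For reference, this is roughly what the BERT _init_weights does in the transformers source (abridged from memory, so double-check against the version you're using); it's what gives your new Linear layer the "same initializer as the transformer":

# Roughly BertPreTrainedModel._init_weights (abridged; details may vary by version)
def _init_weights(self, module):
    if isinstance(module, torch.nn.Linear):
        # Same normal init (std = config.initializer_range) used for the transformer's own Linear layers
        module.weight.data.normal_(mean=0.0, std=self.config.initializer_range)
        if module.bias is not None:
            module.bias.data.zero_()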
So, the following should work. It is basically the same as BertForSequenceClassification.
from transformers import BertPreTrainedModel, AutoConfig, AutoModel
import torch


class ExtendedTransformer(BertPreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        # base_model_prefix of BertPreTrainedModel is "bert", so the attribute must be named self.bert
        self.bert = AutoModel.from_pretrained('bert-base-uncased')
        self.linear = torch.nn.Linear(self.config.hidden_size, 128, bias=False)
        # Recommended final step: runs the architecture's weight initialization and any post-processing
        self.post_init()


if __name__ == '__main__':
    config = AutoConfig.from_pretrained('bert-base-uncased')
    inst = ExtendedTransformer(config)
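With that in place, save_pretrained / from_pretrained should include the linear layer, since it is just another registered sub-module. A minimal round trip (the directory name is only an example):

inst.save_pretrained('extended-bert')  # writes config.json + weights, including self.linear
restored = ExtendedTransformer.from_pretrained('extended-bert')
print(restored.linear.weight.shape)  # torch.Size([128, 768]) for bert-base

One caveat: because __init__ calls AutoModel.from_pretrained, reloading first pulls the original bert-base-uncased weights and then overwrites them with the saved ones. If you want to avoid that, you could build the backbone from the config instead, e.g. self.bert = AutoModel.from_config(config).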