Append a linear layer on top of the vanilla Electra model

bennicholl · April 26, 2021, 5:42pm

I’m using ElectraModel.from_pretrained('google/electra-base-discriminator') to train a multi-label classification model. I would like to add a linear layer that maps the final hidden state to logits. Below is a pseudocode example:

model = ElectraModel.from_pretrained('google/electra-base-discriminator')
logit_layer = torch.nn.Linear(768, 4)
### below code is what I'm trying to figure out 
append_logit_layer = model.append(logit_layer)

My main reason for needing to append this is so that, when I backpropagate with torch SGD,
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.8)
model.parameters() will include my logit_layer weights, so their gradients are applied as well.

lewtun · April 27, 2021, 1:01pm

hey @bennicholl, one idea would be to subclass ElectraModel, add your logit_layer to the init and then override the forward method so the logit_layer is fed the outputs from the encoder (see source code).

there might be more elegant approaches, but i’m pretty sure this one should work
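
A minimal sketch of the subclassing idea above, not a definitive implementation. The class name ElectraWithLogits, pooling on the first ([CLS]) token, and the BCEWithLogitsLoss objective are assumptions that are not in the thread. The tiny random config is only there so the snippet runs without downloading a checkpoint.

```python
import torch
from transformers import ElectraConfig, ElectraModel


class ElectraWithLogits(ElectraModel):
    """ElectraModel plus a linear layer on the final hidden state (hypothetical name)."""

    def __init__(self, config):
        super().__init__(config)
        # Registered as a submodule, so its weights appear in self.parameters().
        self.logit_layer = torch.nn.Linear(config.hidden_size, config.num_labels)

    def forward(self, input_ids=None, attention_mask=None, **kwargs):
        outputs = super().forward(
            input_ids=input_ids, attention_mask=attention_mask, **kwargs
        )
        hidden = outputs[0]        # final hidden states: (batch, seq_len, hidden_size)
        cls_state = hidden[:, 0]   # assumption: pool with the first ([CLS]) token
        return self.logit_layer(cls_state)  # logits: (batch, num_labels)


# Tiny, randomly initialised config so the sketch runs offline.
config = ElectraConfig(
    vocab_size=100, embedding_size=16, hidden_size=32,
    num_hidden_layers=1, num_attention_heads=2,
    intermediate_size=64, num_labels=4,
)
model = ElectraWithLogits(config)
# With the checkpoint from the question this would presumably become:
# model = ElectraWithLogits.from_pretrained('google/electra-base-discriminator', num_labels=4)

# model.parameters() now covers both the encoder and logit_layer.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.8)

# Dummy multi-label batch: 2 sequences of 8 tokens, 4 independent labels each.
input_ids = torch.randint(0, config.vocab_size, (2, 8))
attention_mask = torch.ones_like(input_ids)
labels = torch.randint(0, 2, (2, 4)).float()

logits = model(input_ids=input_ids, attention_mask=attention_mask)
loss = torch.nn.BCEWithLogitsLoss()(logits, labels)  # assumed multi-label loss
loss.backward()
optimizer.step()
```

Because logit_layer is assigned as an attribute inside __init__, PyTorch registers it as a submodule, so no separate "append" step is needed for the optimizer to see it. Reading the output size from config.num_labels rather than a custom constructor argument should let from_pretrained set it through its num_labels keyword.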
