Append a linear layer on top of the vanilla Electra model

I’m using ElectraModel.from_pretrained('google/electra-base-discriminator') to train a multi-label classification task. I would like to add a linear layer that maps the final hidden state to logits. Below is a pseudocode example:

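import torch
from transformers import ElectraModel

model = ElectraModel.from_pretrained('google/electra-base-discriminator')
logit_layer = torch.nn.Linear(768, 4)  # 768 = hidden size, 4 = number of labels
# The line below is pseudocode for the step I'm trying to figure out
append_logit_layer = model.append(logit_layer)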

My main reason for needing to append this is so that when I backpropagate and update the weights with torch's SGD optimizer:
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.8)
model.parameters() will include my logit_layer's weights, so they get updated along with the rest of the model.

hey @bennicholl, one idea would be to subclass ElectraModel, add your logit_layer in __init__, and then override the forward method so the logit_layer is fed the final hidden states from the encoder (see source code).

there might be more elegant approaches, but i’m pretty sure this one should work
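
a rough sketch of the subclassing idea, in case it helps. the class name ElectraWithLogits is made up, taking the first token's hidden state as the sentence representation is an assumption on my part, and the tiny config is only there so the snippet runs without downloading anything:

```python
import torch
from transformers import ElectraConfig, ElectraModel


class ElectraWithLogits(ElectraModel):
    """ElectraModel plus a linear head (hypothetical class name)."""

    def __init__(self, config):
        super().__init__(config)
        # Assigning an nn.Module as an attribute registers it as a submodule,
        # so model.parameters() will include its weight and bias.
        self.logit_layer = torch.nn.Linear(config.hidden_size, config.num_labels)

    def forward(self, input_ids, attention_mask=None):
        outputs = super().forward(input_ids=input_ids, attention_mask=attention_mask)
        hidden_states = outputs[0]          # (batch, seq_len, hidden_size)
        cls_state = hidden_states[:, 0]     # assumption: pool with the first token
        return self.logit_layer(cls_state)  # (batch, num_labels)


# Tiny, randomly initialised config (illustrative values, not the real model sizes).
config = ElectraConfig(
    vocab_size=100,
    embedding_size=16,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=4,
)
model = ElectraWithLogits(config)

# Same optimizer settings as in the question; the head is covered by parameters().
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.8)

input_ids = torch.randint(0, 100, (2, 5))  # fake batch: 2 sequences of 5 token ids
logits = model(input_ids)
```

for the real model, ElectraWithLogits.from_pretrained('google/electra-base-discriminator', num_labels=4) should load the encoder weights and leave logit_layer randomly initialised (expect a warning about newly initialised weights), but treat that as something to check rather than a guarantee.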