Create custom head? I think that's what I need to use custom features

mkeywood · October 7, 2021, 9:41am

Loving Transformers, and so far having some success.
I want to do something along the lines of sequence classification, and when using AutoModelForSequenceClassification with bert-base-uncased, tokenizing, and then using Trainer, all is great.
However, I want to see if I can get better results by creating some custom features to augment the 768 outputs from BERT and training on that.
So in my mind, I can use AutoModel and then use those weights and take it from there, presumably creating my own head?!?

From looking at the difference between AutoModel model and AutoModelForSequenceClassification model I see it is a dropout and then a linear layer from 768 to 2. Makes sense.

So as a first step I thought I would try to reproduce that, and that is when I realised I didn’t understand as much as I thought I did.

I thought I could do something like this (inspired by other topics on this forum):

import torch.nn as nn
from transformers import AutoModel

checkpoint = "bert-base-uncased"  # the checkpoint mentioned above

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.base_model = AutoModel.from_pretrained(checkpoint)
        self.dropout = nn.Dropout(0.1)
        self.linear = nn.Linear(768, 2)

    def forward(self, input_ids, token_type_ids, attention_mask, labels):
        outputs = self.base_model(input_ids, token_type_ids=token_type_ids, attention_mask=attention_mask)

        # last_hidden_state has shape (batch, seq_len, 768)
        outputs = self.dropout(outputs['last_hidden_state'])
        outputs = self.linear(outputs)

        return outputs

model = MyModel()
model.to('cuda')

But I get errors:

RuntimeError: grad can be implicitly created only for scalar outputs

And I realise I really don’t know where to look next.
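
A likely cause of this error, assuming the model is passed to Trainer: Trainer takes the loss from the model's output and calls backward on it, but a forward pass that returns only a tensor of per-token scores has no scalar loss to offer. The sketch below is one possible way around that, not a definitive implementation. It keeps only the first token's hidden state, computes a cross-entropy loss when labels are given, and returns a dict with "loss" and "logits". The names ClassifierWithLoss and TinyBody are hypothetical; TinyBody is a stand-in encoder so the example runs without downloading weights, and in practice the body would be something like AutoModel.from_pretrained("bert-base-uncased"). Note that using the raw first-token state is a simplification of what AutoModelForSequenceClassification does for BERT, which goes through the pooler output.

```python
import torch
import torch.nn as nn


class TinyBody(nn.Module):
    """Hypothetical stand-in encoder; replace with a pretrained AutoModel."""

    def __init__(self, vocab_size=100, hidden_size=16):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden_size)

    def forward(self, input_ids, token_type_ids=None, attention_mask=None):
        # Mimic the key used by the real model output.
        return {"last_hidden_state": self.emb(input_ids)}


class ClassifierWithLoss(nn.Module):
    """Encoder body + dropout + linear head that also returns a scalar loss."""

    def __init__(self, body, hidden_size=768, num_labels=2):
        super().__init__()
        self.body = body
        self.dropout = nn.Dropout(0.1)
        self.linear = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids, token_type_ids=None, attention_mask=None, labels=None):
        outputs = self.body(
            input_ids, token_type_ids=token_type_ids, attention_mask=attention_mask
        )
        # Keep one vector per sequence: (batch, seq_len, hidden) -> (batch, hidden).
        first_token = outputs["last_hidden_state"][:, 0]
        logits = self.linear(self.dropout(first_token))
        if labels is None:
            return {"logits": logits}
        # Scalar loss, so backward() can be called on it.
        loss = nn.functional.cross_entropy(logits, labels)
        return {"loss": loss, "logits": logits}


model = ClassifierWithLoss(TinyBody(hidden_size=16), hidden_size=16)
input_ids = torch.randint(0, 100, (4, 10))
labels = torch.tensor([0, 1, 1, 0])
out = model(input_ids, labels=labels)
out["loss"].backward()
```

Making labels optional means the same forward pass can be reused at inference time, when no labels are available.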

Any guidance would be greatly appreciated, either on this approach, or in general how best to approach augmenting additional custom features.
For example, even if this works, how would I ‘inject’ my custom features into this model?
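
One way to approach the injection question, as a minimal sketch under stated assumptions: concatenate the custom features with the 768-dimensional sequence vector just before the linear layer, and widen that layer to match. The names FeatureAugmentedHead, extra_features, and num_extra are hypothetical, and the choice of 4 extra features is arbitrary. The pooled input here is random data standing in for the encoder's per-sequence output.

```python
import torch
import torch.nn as nn


class FeatureAugmentedHead(nn.Module):
    """Hypothetical head: dropout, concatenate custom features, then linear."""

    def __init__(self, hidden_size=768, num_extra=4, num_labels=2, dropout=0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        # The linear layer sees the encoder vector plus the custom features.
        self.linear = nn.Linear(hidden_size + num_extra, num_labels)

    def forward(self, pooled, extra_features, labels=None):
        # pooled: (batch, hidden_size); extra_features: (batch, num_extra)
        combined = torch.cat([self.dropout(pooled), extra_features.float()], dim=-1)
        logits = self.linear(combined)
        if labels is None:
            return {"logits": logits}
        loss = nn.functional.cross_entropy(logits, labels)
        return {"loss": loss, "logits": logits}


head = FeatureAugmentedHead()
pooled = torch.randn(3, 768)        # stand-in for the encoder output
extra = torch.randn(3, 4)           # stand-in for the custom features
labels = torch.tensor([0, 1, 0])
out = head(pooled, extra, labels)
```

If this is wired into a full model used with Trainer, the custom features would need to reach forward as a named argument; as far as I understand, Trainer by default drops dataset columns whose names do not match the model's forward signature, so the column name and the argument name should agree. Scaling the custom features to a range comparable to the encoder outputs may also help, though that is a guess rather than something tested here.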

Thanks

raygx · April 18, 2024, 10:18pm

@mkeywood Have you found a solution? If so, please share it; I am facing this very problem.

Thanks.
