Correct way to implement a custom model on top of pretrained BERT?

My team and I (beginners) are working on an ML project with Hugging Face BERT, performing binary classification of sentences based on an attribute of each sentence. Below is the code for our custom model built on top of the
neuralspace-reverie/indic-transformers-bn-bert pretrained model. We are unsure whether the ordering of dropout and the activation function matters inside the class, and whether our implementation of the custom hidden layers is correct overall. We intended to add two dense layers (l1 and l2) with a tanh activation function and dropout of 0.1, with 512 and 256 nodes respectively, and a softmax output layer.

import torch.nn as nn
from transformers import AutoConfig, AutoModel

class MyTaskSpecificCustomModel(nn.Module):
    def __init__(self, checkpoint, num_labels):
        super(MyTaskSpecificCustomModel, self).__init__()
        self.num_labels = num_labels
        
        self.model = AutoModel.from_pretrained(
            checkpoint,
            config=AutoConfig.from_pretrained(
                checkpoint, output_attentions=True, output_hidden_states=True
            ),
        )
        
        # This is to freeze the weights of the pretrained model.
        for param in self.model.parameters():
            param.requires_grad = False
            
        # New Layer
        self.dropout1 = nn.Dropout(0.1)
        # self.classifier = nn.Linear(768, num_labels )
        #layer 1
        self.dropout2 = nn.Dropout(0.1)
        self.activation1 = nn.Tanh()
        self.l1 = nn.Linear(768, 512)

        #layer 2
        self.dropout3 = nn.Dropout(0.1)
        self.l2 = nn.Linear(512, 256)
        self.activation2 = nn.Tanh()

        #layer 3
        self.l3 = nn.Linear(256, num_labels)
        # self.activation3 = nn.Tanh()
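        # LogSoftmax returns log-probabilities, so the matching loss is nn.NLLLoss
        # (alternatively, drop this layer and use nn.CrossEntropyLoss on raw logits)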
        self.softmax = nn.LogSoftmax(dim=1)
        
    def forward(self, input_ids=None, attention_mask=None, Type=None):
        outputs = self.model(input_ids = input_ids, attention_mask = attention_mask)
        last_hidden_state = outputs[0]       
        sequence_outputs = self.dropout1(last_hidden_state)
        
        # logits = self.classifier(sequence_outputs[:, 0, : ].view(-1, 768))
        #layer 1
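        # sequence_outputs[:, 0, :] is the [CLS] token embedding, shape (batch_size, 768)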
        logits = self.l1(sequence_outputs[:, 0, : ].view(-1, 768))
        logits = self.dropout2(logits)
        logits = self.activation1(logits)

        #layer 2
        logits = self.l2(logits)
        logits = self.dropout3(logits)
        logits = self.activation2(logits)

        #output layer
        logits = self.l3(logits)
        logits = self.softmax(logits)

        return logits
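
For reference, this is roughly how we call the model during training (a minimal sketch with a dummy batch; the optimizer choice, learning rate, and the use of nn.NLLLoss to match the LogSoftmax output are our assumptions, not fixed parts of our pipeline):

import torch
import torch.nn as nn

checkpoint = "neuralspace-reverie/indic-transformers-bn-bert"
model = MyTaskSpecificCustomModel(checkpoint, num_labels=2)

# LogSoftmax outputs log-probabilities, so NLLLoss is the matching criterion.
criterion = nn.NLLLoss()
# Only the new layers are trainable, since the pretrained weights are frozen.
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4
)

# Dummy batch of 4 already-tokenized sentences of length 16, for illustration only.
input_ids = torch.randint(0, 1000, (4, 16))
attention_mask = torch.ones(4, 16, dtype=torch.long)
labels = torch.randint(0, 2, (4,))

model.train()
log_probs = model(input_ids=input_ids, attention_mask=attention_mask)
loss = criterion(log_probs, labels)
loss.backward()
optimizer.step()
optimizer.zero_grad()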