How to change the linear classifier?

Hi, I'm using HuggingFace for multi-label classification. Is there a way to change/customize the classifier head on top? My second question: the default classifier is linear, but I've heard of classifiers like SVMs and decision trees. Can anyone explain the connection between the linear classifier and those?

You can extend a pretrained model with your own layers as much as you want. Something like this can work:

import torch.nn as nn
from transformers import BertModel, BertPreTrainedModel

class BertCustomClassification(BertPreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        self.num_labels = config.num_labels

        self.bert = BertModel(config)
        # Add your own classifier layers below
        self.dropout = nn.Dropout(config.hidden_dropout_prob)
        self.preclassifier = nn.Linear(config.hidden_size, config.hidden_size)
        self.act = nn.GELU()
        self.classifier = nn.Linear(config.hidden_size, config.num_labels)

        # don't forget to initialize the weights of the new layers
        self.init_weights()
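
You can then load a pretrained checkpoint into this class in the usual way: the BERT weights come from the checkpoint, while the new head layers are freshly initialized. (The checkpoint name and label count below are just placeholders.)

model = BertCustomClassification.from_pretrained('bert-base-uncased', num_labels=5)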

You should then also change the forward pass, of course.
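
For multi-label classification, a minimal sketch of what that forward method could look like (it goes inside the class above; this assumes a recent transformers version where the model returns a ModelOutput with pooler_output, and uses BCEWithLogitsLoss since each sample can have several labels):

def forward(self, input_ids=None, attention_mask=None, labels=None):
    outputs = self.bert(input_ids, attention_mask=attention_mask)
    # pooler_output is the [CLS] representation after BERT's pooling layer
    pooled = self.dropout(outputs.pooler_output)
    pooled = self.act(self.preclassifier(pooled))
    logits = self.classifier(self.dropout(pooled))

    loss = None
    if labels is not None:
        # multi-label: an independent sigmoid per label,
        # so labels should be float multi-hot vectors
        loss_fct = nn.BCEWithLogitsLoss()
        loss = loss_fct(logits, labels.float())
    return (loss, logits) if loss is not None else logits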

Your question about other ML architectures like SVMs and decision trees is too broad for this forum and falls outside the scope of HuggingFace Transformers. You can ask such questions on a website like Stack Overflow (but search first, because this question has been asked a billion times).


Thanks, but I don't know how I should use this or where I should call it. I also don't know what kinds of classifiers I can use instead of the default linear classifier, or how to implement them. This is part of my code:

from torch.utils.data import DataLoader, RandomSampler, TensorDataset
from transformers import BartForSequenceClassification, BartTokenizer

max_length = 50
# note: do_lower_case has no effect here, BART's tokenizer is case-sensitive
tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')
# return_tensors='pt' gives PyTorch tensors directly, which TensorDataset requires
encodings = tokenizer.batch_encode_plus(comments, max_length=max_length, padding='max_length', truncation=True, return_tensors='pt')
train_inputs = encodings['input_ids']
train_masks = encodings['attention_mask']
batch_size = 48
train_data = TensorDataset(train_inputs, train_masks, train_labels)  # train_labels must also be a tensor
train_sampler = RandomSampler(train_data)
train_dataloader = DataLoader(train_data, sampler=train_sampler, batch_size=batch_size)
model = BartForSequenceClassification.from_pretrained('facebook/bart-large', num_labels=num_labels)
model.cuda()

and after this, I go on to the training part.

@BramVanroy I am attempting to do something similar. Can you provide some feedback about Extending a pretrained model?