How do I make SeqClassifier that accepts multiple sequences?

So suppose I have the following code:

import torch
import torch.nn as nn
from transformers import BertModel

class SeqClassifier(nn.Module):
    def __init__(self, n_classes):
        super(SeqClassifier, self).__init__()  # was SentimentClassifier, which raises a NameError
        self.bert = BertModel.from_pretrained('bert-base-uncased')
        self.drop = nn.Dropout(p=0.3)
        # hidden_size is 768 for bert-base-uncased
        self.out = nn.Linear(self.bert.config.hidden_size, n_classes)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids, attention_mask=attention_mask)
        pooled_output = out.pooler_output
        # currently [1, 768]; I want [num_seq, 768]
        output = self.out(self.drop(pooled_output))
        return output

The pooled output is the embedding of the [CLS] token after a linear transformation and a tanh activation; it is not the same as the [CLS] embedding that comes straight out of the model's last hidden layer. What I want is a way to pass in multiple sentences at once, so that pooled_output becomes [num_seq, 768] and the classifier can train on all of them directly.
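For context, this is roughly what the model's pooler computes (a minimal sketch of what transformers' BertPooler does; the function name and the `dense` argument are just placeholders here). The key point is that it operates row-wise, so the batch dimension carries straight through:

import torch

# rough sketch of the pooler: take the hidden state of the first token
# ([CLS]) in every sequence of the batch, run it through a learned
# linear layer, then tanh
def pool(last_hidden_state, dense):       # last_hidden_state: [batch, seq_len, 768]
    cls_hidden = last_hidden_state[:, 0]  # [batch, 768], one [CLS] per sequence
    return torch.tanh(dense(cls_hidden))  # [batch, 768]

So if input_ids had shape [num_seq, seq_len] to begin with, pooler_output would already come out as [num_seq, 768].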

This is how I process my input at the moment:

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
tokenized = tokenizer(text)  # 'text' stands in for one raw input sentence
input_ids = torch.tensor([tokenized['input_ids']], dtype=torch.long)  # [1, seq_len]
input_mask = torch.tensor([tokenized['attention_mask']], dtype=torch.long)

Everything works as long as I feed in a single sequence, but as soon as I try more than one, I get confused.

One option seems to be to prepend a [CLS] token to each new sequence, so that the tokenizer automatically adds the required ids.
However, the BERT model still returns only one pooled output, not one per sequence.

Any ideas what I’m doing wrong?


Okay, I figured it out. I'm not sure whether I'm supposed to delete the post or keep it, but in case anyone else struggles with this:

The reason it went wrong was the tensor dimensions: the model expects input_ids as a batch of shape [num_seq, seq_len], and mine had an extra leading dimension of size 1.

bert_model = BertModel.from_pretrained('bert-base-uncased')

# each input is [1, seq_len]; stacking two on dim=1 gives [1, 2, seq_len],
# and squeeze() drops the leading 1, leaving a proper batch of [2, seq_len]
two_input_ids = torch.stack([input_ids, input_ids], dim=1).squeeze()
two_input_masks = torch.stack([input_mask, input_mask], dim=1).squeeze()
out = bert_model(two_input_ids, attention_mask=two_input_masks)
out.pooler_output.shape
# [2, 768], as desired
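Side note: the same batch can be built without the manual stack/squeeze by letting the tokenizer batch and pad the raw sentences itself (padding also copes with sequences of unequal length, which torch.stack alone cannot). The two sentences below are just placeholders:

batch = tokenizer(
    ['first placeholder sentence', 'second placeholder sentence'],
    padding=True,         # pad to the longest sequence in the batch
    return_tensors='pt',  # return PyTorch tensors, already batched
)
out = bert_model(batch['input_ids'], attention_mask=batch['attention_mask'])
out.pooler_output.shape
# [2, 768] as well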