Darshan Hiranandani: How to Modify SeqClassifier to Handle Multiple Sequences in a Batch?

Hi all,

I’m Darshan Hiranandani, working on a SeqClassifier model based on BERT, and I need to modify it so that it can handle multiple sequences at once in a batch. Here’s the current structure of the model:
import torch
import torch.nn as nn
from transformers import BertModel

class SeqClassifier(nn.Module):
    def __init__(self, n_classes):
        super(SeqClassifier, self).__init__()
        self.bert = BertModel.from_pretrained('bert-base-uncased')
        self.drop = nn.Dropout(p=0.3)
        self.out = nn.Linear(self.bert.config.hidden_size, n_classes)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids, attention_mask=attention_mask)
        pooled_output = out.pooler_output
        # currently [1, 768], but I want [num_seq, 768]
        output = self.out(self.drop(pooled_output))
        return output

Currently, this works fine for a single sequence. However, I want to process multiple sequences at once in a batch. The issue is that pooler_output only gives me a single [CLS] vector of shape [1, 768], even when I pass multiple sequences. Ideally, the pooled output should be [num_seq, 768], where num_seq is the batch size.
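To make the shape I'm after concrete, here is a minimal check I would expect to pass once batching works (the token ids below are random placeholders rather than real tokenized text, and n_classes=2 is arbitrary):

import torch

model = SeqClassifier(n_classes=2)
num_seq, seq_len = 4, 16
dummy_ids = torch.randint(0, model.bert.config.vocab_size, (num_seq, seq_len))
dummy_mask = torch.ones(num_seq, seq_len, dtype=torch.long)

pooled = model.bert(dummy_ids, attention_mask=dummy_mask).pooler_output
print(pooled.shape)  # I'd expect torch.Size([4, 768]), one vector per sequence
print(model(dummy_ids, attention_mask=dummy_mask).shape)  # and torch.Size([4, 2]) from the classifier head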

Here’s how I process my inputs right now:
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
tokenized = tokenizer(text)  # `text` is the single input sentence
input_ids = torch.tensor([tokenized['input_ids']], dtype=torch.long)
input_mask = torch.tensor([tokenized['attention_mask']], dtype=torch.long)
This works for a single sentence, but I’m running into trouble when I try to pass a batch of sequences. I’ve considered prepending [CLS] tokens to each sequence, but the BERT model still only returns one pooled output.
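For reference, this is roughly what I've been attempting for the batched case, letting the tokenizer pad the sequences to a common length (the sentences are made-up examples, and model is the SeqClassifier instance from above):

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# made-up example batch
texts = ["the movie was great",
         "terrible service, would not recommend",
         "it was fine"]

# pad/truncate to a common length and return PyTorch tensors
encoded = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')
input_ids = encoded['input_ids']        # shape [num_seq, max_len]
input_mask = encoded['attention_mask']  # shape [num_seq, max_len]

logits = model(input_ids, attention_mask=input_mask)
print(logits.shape)  # hoping for [num_seq, n_classes]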

Has anyone here worked with multi-sequence inputs in a similar setup? How can I modify the SeqClassifier to return one pooled output per sequence in the batch?

Any suggestions or guidance would be much appreciated!

Thanks!
Regards
Darshan Hiranandani
