Hi all,
I’m Darshan Hiranandani, working on a SeqClassifier model based on BERT, and I need to modify it so that it can handle multiple sequences at once in a batch. Here’s the current structure of the model:
```python
class SeqClassifier(nn.Module):
    def __init__(self, n_classes):
        super(SeqClassifier, self).__init__()
        self.bert = BertModel.from_pretrained('bert-base-uncased')
        self.drop = nn.Dropout(p=0.3)
        self.out = nn.Linear(self.bert.config.hidden_size, n_classes)

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids, attention_mask=attention_mask)
        pooled_output = out.pooler_output  # [1, 768], but I want it to be [num_seq, 768]
        output = self.out(self.drop(pooled_output))
        return output
```
Currently, this works fine for a single sequence. However, I want to process multiple sequences at once (e.g., in a batch). The issue is that `pooler_output` only gives me one vector for the `[CLS]` token, even when I pass multiple sequences. Ideally, I want the pooled output to be `[num_seq, 768]`, where `num_seq` is the batch size.
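To make the shape question concrete, this is the behaviour I'm after (the dummy tensors below are placeholders I made up for illustration, not my real data):

```python
import torch

# made-up batch: 4 sequences, each padded to length 32
dummy_input_ids = torch.randint(0, 30522, (4, 32), dtype=torch.long)
dummy_attention_mask = torch.ones((4, 32), dtype=torch.long)

model = SeqClassifier(n_classes=3)
logits = model(dummy_input_ids, attention_mask=dummy_attention_mask)

# what I'd like: pooled_output of shape [4, 768] inside forward,
# so that logits come out as [4, n_classes] here
print(logits.shape)
```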
Here’s how I process my inputs right now:
```python
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
input_ids = torch.tensor([tokenized['input_ids']], dtype=torch.long)
input_mask = torch.tensor([tokenized['attention_mask']], dtype=torch.long)
```
This works for a single sentence, but I'm running into trouble when I try to pass a batch of sequences. I've considered prepending `[CLS]` tokens to each sequence, but the BERT model still only returns one pooled output.
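For reference, this is roughly the kind of batched input I'd like to end up feeding the model (the `sentences` list below is just made-up example data, not my actual inputs):

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# made-up example batch of raw texts
sentences = [
    "first example sentence",
    "a second, somewhat longer example sentence",
]

# tokenize the whole batch at once, padding to the longest sequence
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

input_ids = encoded['input_ids']        # shape [num_seq, max_len]
input_mask = encoded['attention_mask']  # shape [num_seq, max_len]
```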
Has anyone here worked with multi-sequence inputs in a similar setup? How can I modify `SeqClassifier` to return one pooled output per sequence in the batch?
Any suggestions or guidance would be much appreciated!
Thanks!
Regards
Darshan Hiranandani