Multi-label sequence labeling (e.g., multi-label NER)

I am trying to create a multi-label model for sequence labeling (e.g., multi-label NER), where each token in the input can have multiple labels. For example, given the sentence “George Washington University”, the token “George” can have two labels, “B-Per” and “B-Loc”.
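
To make that concrete, assuming (purely for illustration) a 5-label scheme ["B-Per", "I-Per", "B-Loc", "I-Loc", "O"], the target for “George” would be a multi-hot vector:

label_names = ["B-Per", "I-Per", "B-Loc", "I-Loc", "O"]   # illustrative label set (5 labels)
george_target = [1, 0, 1, 0, 0]                           # both "B-Per" and "B-Loc" are active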

So far, I have implemented my model using:

import torch
from transformers import BertTokenizerFast, BertConfig, BertModel
from transformers.modeling_outputs import SequenceClassifierOutput

class seq2SeqBERT(torch.nn.Module):
	def __init__(self):
		super(seq2SeqBERT, self).__init__()
		configuration = BertConfig()
		self.bert = BertModel(configuration)           # randomly initialised BERT encoder (default config)
		self.classifier = torch.nn.Linear(768, 5)      # one logit per label, 5 labels per token
		self.criterion = torch.nn.BCEWithLogitsLoss()  # independent sigmoid per label -> multi-label loss

	def forward(self, input_ids, attention_mask, labels=None):
		embeddings = self.bert(input_ids=input_ids, attention_mask=attention_mask)
		logits = self.classifier(embeddings['last_hidden_state'])  # (batch_size, seq_len, 5)
		loss_ = None
		if labels is not None:
			# drop positions marked -100 (special tokens / sub-word continuations) before the loss
			flat_outputs = logits[labels != -100]
			flat_labels = labels[labels != -100]
			loss_ = self.criterion(flat_outputs, flat_labels)
		return SequenceClassifierOutput(loss=loss_, logits=logits,
		                                attentions=embeddings.attentions)  # None unless output_attentions=True
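
For context, a single training step looks roughly like this (the optimizer choice and the batch variables are illustrative; the labels tensor is the multi-hot format I describe below):

model = seq2SeqBERT()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)   # illustrative optimizer / learning rate

# batch["input_ids"], batch["attention_mask"]: (batch_size, seq_len) tensors from the tokenizer
# batch["labels"]: (batch_size, seq_len, 5) float tensor of multi-hot targets, -100 on ignored positions
outputs = model(input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"],
                labels=batch["labels"])
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()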

My input and target sequences look something like:
Input:  [word1, word2, …, wordN]
Output: [[0,0,0,0,0], [1,0,1,0,1], …, [0,1,1,0,0]], i.e., each word is associated with a multi-hot vector over the 5 labels.
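
For reference, this is roughly how I build those label tensors (the checkpoint name, the word-level tag lists, and the helper name are just illustrative); sub-word continuations and special tokens get -100 so that the labels != -100 filter in the forward pass masks them out:

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")   # example checkpoint

def encode_words(words, word_tags):
	# word_tags: one multi-hot list of length 5 per word, e.g. [[1, 0, 1, 0, 0], ...] (illustrative)
	enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
	labels, previous = [], None
	for word_id in enc.word_ids(batch_index=0):
		if word_id is None or word_id == previous:   # special tokens and sub-word continuations
			labels.append([-100.0] * 5)
		else:
			labels.append([float(x) for x in word_tags[word_id]])
		previous = word_id
	enc["labels"] = torch.tensor(labels).unsqueeze(0)    # (1, seq_len, 5)
	return enc

The resulting enc["input_ids"], enc["attention_mask"], and enc["labels"] are what go into the forward pass above.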

I can see that the loss is going down during training, but when I try to run inference with the following:

outputs = model(input_ids=input_ids, attention_mask=attention_mask)
logits = outputs['logits']                  # (batch_size, seq_len, 5)
predictions = torch.sigmoid(logits)         # per-label probabilities for every token

I am getting predictions of [1, 1, 1, 1, 1] for every word (i.e., the model is predicting all the classes for all the words).
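
To be explicit, I then threshold these probabilities to read off the predicted label set per token (the 0.5 cut-off is just my choice), and it is these multi-hot rows that come out as all ones:

# predictions: (batch_size, seq_len, 5) per-label probabilities from the sigmoid above
preds = (predictions > 0.5).long()    # 1 = label assigned to the token, 0 = not assigned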

Can someone please guide me toward the correct implementation of the model? Even a few suggestions would be very helpful.