I have a classification task where I fine-tuned a pretrained BERT model on customer reviews (classifying a text as "customer service text", "user experience", etc.):
As you can see, I have 8 distinct classes. My fine-tuned classification model performs quite well on unseen data, with an F1 score of 80.1. However, a single text can belong to 2 different classes. My question is: how do I have to change my code to achieve that? I have already transformed my target variable with MultiLabelBinarizer, so that it looks like this:
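(For illustration, a minimal sketch of that step; only two of the 8 class names are taken from the description above, the rest are placeholders:)

```python
from sklearn.preprocessing import MultiLabelBinarizer

# two of the 8 classes are named above; the remaining names are placeholders
classes = ["customer service text", "user experience", "other_1", "other_2",
           "other_3", "other_4", "other_5", "other_6"]

mlb = MultiLabelBinarizer(classes=classes)
y = mlb.fit_transform([
    ["customer service text"],                      # text with a single class
    ["customer service text", "user experience"],   # text belonging to two classes
])
print(y)
# [[1 0 0 0 0 0 0 0]
#  [1 1 0 0 0 0 0 0]]
```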
You can set the problem_type of an xxxForSequenceClassification model to multi_label_classification when instantiating it, like so:
```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-german-dbmdz-uncased",
    problem_type="multi_label_classification",
    num_labels=num_labels_cla,
)
```
This ensures that BCEWithLogitsLoss is used instead of CrossEntropyLoss, which is what multi-label classification requires. You can then fine-tune just as you would for multi-class classification; the only difference is that the labels are multi-hot float vectors.
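A minimal sketch of a single forward pass under these assumptions (the checkpoint name and example text are illustrative; the key point is that the labels are passed as float multi-hot vectors):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-german-dbmdz-uncased"  # illustrative checkpoint
num_labels_cla = 8

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    problem_type="multi_label_classification",
    num_labels=num_labels_cla,
)

texts = ["Der Kundenservice war sehr freundlich."]  # hypothetical review
# multi-hot labels must be floats of shape (batch_size, num_labels)
labels = torch.tensor([[1., 0., 0., 0., 0., 0., 1., 0.]])

inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
outputs = model(**inputs, labels=labels)
print(outputs.loss)          # BCEWithLogitsLoss value
print(outputs.logits.shape)  # torch.Size([1, 8])
```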
The logits will be of shape (batch_size, num_labels). The docs of PyTorch's BCEWithLogitsLoss indicate that the labels should have the same shape, so that's indeed correct.
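Continuing the sketch above, at inference time you typically apply a sigmoid to the logits and threshold each class independently (0.5 is a common starting point, not a fixed rule):

```python
import torch

with torch.no_grad():
    logits = model(**inputs).logits  # shape (batch_size, num_labels)

probs = torch.sigmoid(logits)        # independent per-class probabilities
preds = (probs > 0.5).int()          # multi-hot predictions, same shape as the labels
print(preds)
```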