I have a dataset of roughly 44k data points and 1,500 labels. I want to use AutoModelForSequenceClassification to classify the labels, and I am able to pass in "multi_label_classification" as the problem type. However, the F1 score and accuracy are quite poor. I suspect it's because the data is sparse and 0 labels are preferred over 1 labels (since there are far fewer 1s than 0s across all categories).
Although this is a broad problem, in particular I was wondering how I can pass the "pos_weight" parameter to the "BCEWithLogitsLoss" function that the model uses during training, through the Trainer API or TrainingArguments.
I looked at the source code, and it doesn't seem to accept such a parameter.
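For reference, pos_weight scales the loss term for positive targets per class, so a value above 1 pushes the model toward predicting 1s for rare classes. A quick standalone illustration (the numbers here are made up):

import torch
from torch.nn import BCEWithLogitsLoss

logits = torch.tensor([[0.2, -1.3]])   # one example, two labels
targets = torch.tensor([[1.0, 0.0]])

# weight positives of the first class ten times more heavily
weighted = BCEWithLogitsLoss(pos_weight=torch.tensor([10.0, 1.0]))
unweighted = BCEWithLogitsLoss()
print(unweighted(logits, targets), weighted(logits, targets))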
Hello @ayush-adeptmind,
From my understanding of your question, you want to introduce a weighted loss into your multi-label classification problem.
For such a problem, the best way I'm aware of while using the transformers library is to subclass the Trainer class (create a new trainer class that inherits from the actual Trainer) and override its compute_loss method.
When instantiating BCEWithLogitsLoss, you should be able to pass in a pos_weight vector.
See "How can I use class_weights when training?" for more precise information and code snippets.
Have a good day!
Adding the exact code for my implementation, based on the source provided above. Thanks!
from typing import Optional

import logging

from torch import FloatTensor
from torch.nn import BCEWithLogitsLoss
from transformers import Trainer


class WeightedTrainer(Trainer):
    def __init__(self, *args, class_weights: Optional[FloatTensor] = None, **kwargs):
        super().__init__(*args, **kwargs)
        if class_weights is not None:
            class_weights = class_weights.to(self.args.device)
            logging.info("Using multi-label classification with class weights: %s", class_weights)
        # pos_weight=None falls back to an ordinary unweighted BCEWithLogitsLoss
        self.loss_fct = BCEWithLogitsLoss(pos_weight=class_weights)
    def compute_loss(self, model, inputs, return_outputs=False):
        """
        How the loss is computed by Trainer. By default, all models return the loss in the first element.

        Subclass and override for custom behavior.
        """
        labels = inputs.pop("labels").float()  # BCEWithLogitsLoss expects float targets
        outputs = model(**inputs)
        try:
            loss = self.loss_fct(outputs.logits.view(-1, model.num_labels), labels.view(-1, model.num_labels))
        except AttributeError:  # model is wrapped in DataParallel, so num_labels lives on model.module
            loss = self.loss_fct(outputs.logits.view(-1, model.module.num_labels), labels.view(-1, model.num_labels))
        return (loss, outputs) if return_outputs else loss
trainer = WeightedTrainer(
    model,
    args,
    train_dataset=ds_enc["train"],
    eval_dataset=ds_enc["test"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
    class_weights=weights,
)
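One note: weights here is a FloatTensor with one positive weight per label, computed before the trainer is constructed. The post doesn't show how it was built; a common recipe (just a sketch, assuming ds_enc["train"]["labels"] holds multi-hot label vectors) is the per-class negative-to-positive ratio:

import torch

# build a (num_examples, num_labels) multi-hot float matrix from the dataset
label_matrix = torch.tensor(ds_enc["train"]["labels"], dtype=torch.float)
pos_counts = label_matrix.sum(dim=0)              # number of 1s per class
neg_counts = label_matrix.shape[0] - pos_counts   # number of 0s per class
weights = neg_counts / pos_counts.clamp(min=1.0)  # clamp avoids division by zero

With that in place, trainer.train() uses the weighted loss for every batch.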