Multilabel classification performance metrics using Trainer API

RuudVelo · November 16, 2021, 8:15pm

Hello,

My goal is to report model performance metrics for my multilabel classification problem (I am using a DistilBERT architecture, by the way). Looking at each label individually, most of them are heavily imbalanced, so I also want to correct for the label (or class) imbalance.

I am fairly new to this. By looking at some examples and experimenting myself, I have done the following:

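import torch

def accuracy_thresh(y_pred, y_true, thresh=0.5, sigmoid=True):
    # Trainer passes NumPy arrays to compute_metrics, so convert them to tensors first
    y_pred = torch.from_numpy(y_pred)
    y_true = torch.from_numpy(y_true)
    if sigmoid:
        y_pred = y_pred.sigmoid()  # logits -> per-label probabilities
    # fraction of (example, label) entries whose thresholded prediction matches the target
    return ((y_pred > thresh) == y_true.bool()).float().mean().item()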

The above code calculates the element-wise accuracy (over every example/label pair) given a threshold of 0.5.

Next, I wrote the following compute_metrics function, which also reports the result of the function above:

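# Assumption: the original post does not show where sigmoid comes from; scipy's expit is one option
from scipy.special import expit as sigmoid
from sklearn.metrics import classification_report

def compute_metrics(eval_pred):
    predictions, labels = eval_pred  # logits and multi-hot labels as NumPy arrays
    y_true = labels
    y_pred = (sigmoid(predictions) > 0.5).astype(float)  # threshold the probabilities at 0.5

    # all_labels is the list of label names, defined elsewhere
    clf_dict = classification_report(y_true, y_pred, target_names=all_labels,
                                     zero_division=0, output_dict=True)

    return {"accuracy_thresh": accuracy_thresh(predictions, labels),
            "micro f1": clf_dict['micro avg']['f1-score'],
            "macro f1": clf_dict['macro avg']['f1-score'],
            "weighted f1": clf_dict['weighted avg']['f1-score']}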

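To see how these averages react to imbalance, here is a small made-up example with two labels, one common and one rare. The arrays are invented purely for illustration and it uses scikit-learn's f1_score directly rather than classification_report:

```python
import numpy as np
from sklearn.metrics import f1_score

# Made-up data: column 0 is a common label, column 1 is a rare label
y_true = np.array([[1, 0], [1, 0], [1, 0], [1, 0], [0, 0], [0, 1]])
# Predictions get the common label right everywhere but never predict the rare one
y_pred = np.array([[1, 0], [1, 0], [1, 0], [1, 0], [0, 0], [0, 0]])

elementwise_acc = (y_true == y_pred).mean()  # the quantity accuracy_thresh measures
micro = f1_score(y_true, y_pred, average="micro", zero_division=0)
macro = f1_score(y_true, y_pred, average="macro", zero_division=0)
weighted = f1_score(y_true, y_pred, average="weighted", zero_division=0)

print(f"accuracy={elementwise_acc:.3f} micro={micro:.3f} "
      f"macro={macro:.3f} weighted={weighted:.3f}")
```

In this toy case the element-wise accuracy and micro F1 stay high even though the rare label is never predicted, while macro F1 drops to 0.5 because each label counts equally.
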
It looks a bit hacky, but it works (it runs). I have the Trainer as follows:

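import torch
from transformers import Trainer

class MultilabelTrainer(Trainer):
    # **kwargs absorbs extra arguments that newer Trainer versions may pass to compute_loss
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")  # take the labels out so the model does not compute its own loss
        outputs = model(**inputs)
        logits = outputs.logits
        # class_weights: per-label tensor of (negatives / positives), defined elsewhere
        loss_fct = torch.nn.BCEWithLogitsLoss(pos_weight=class_weights.to(logits.device))
        loss = loss_fct(logits.view(-1, self.model.config.num_labels),
                        labels.float().view(-1, self.model.config.num_labels))
        return (loss, outputs) if return_outputs else loss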

Note: I am actually using pos_weight. Why? Because I am dealing with imbalanced labels, as mentioned above, I have a tensor that contains a weight for each label, calculated as the number of negative cases divided by the number of positive cases.
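For illustration, weights of this kind could be computed from a multi-hot label matrix roughly as follows. This is only a sketch: the train_labels array is made up, and clipping the positive count to at least 1 is an added assumption to avoid dividing by zero for labels that never occur.

```python
import numpy as np

# Hypothetical multi-hot training labels: 6 examples x 3 labels (made-up data)
train_labels = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 1],
])

num_pos = train_labels.sum(axis=0)          # positives per label
num_neg = train_labels.shape[0] - num_pos   # negatives per label

# pos_weight = negatives / positives; the clip guards against labels
# with no positive examples (an assumption, not part of the original post)
pos_weight = num_neg / np.clip(num_pos, 1, None)
print(pos_weight)
```

The resulting array can then be converted with something like torch.tensor(pos_weight, dtype=torch.float) before it is handed to BCEWithLogitsLoss.
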

The trainer is then set up as follows:

multi_trainer = MultilabelTrainer(
    model,
    args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer)

My main question is: does what I have done actually make sense? That is, am I getting the right performance metrics, given that I want to correct for imbalance? Or is there a better (e.g. less verbose) alternative to achieve this?

nielsr · November 17, 2021, 2:22pm

Hi,

I’ve created a notebook for you to illustrate this: Transformers-Tutorials/Fine_tuning_BERT_(and_friends)_for_multi_label_text_classification.ipynb at master · NielsRogge/Transformers-Tutorials · GitHub

Actually, there’s no need for a MultilabelTrainer anymore, as you can just set the problem_type of the model’s configuration to “multi_label_classification”.
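A minimal sketch of what that setting does, using a tiny randomly initialised DistilBERT so that nothing has to be downloaded. The model sizes and the dummy batch are made up for illustration; with a real checkpoint the same problem_type value would normally be passed to from_pretrained instead.

```python
import torch
from transformers import DistilBertConfig, DistilBertForSequenceClassification

# Tiny, randomly initialised model (illustrative sizes, not a real checkpoint)
config = DistilBertConfig(
    vocab_size=100, dim=32, hidden_dim=64, n_layers=1, n_heads=2,
    num_labels=3,
    problem_type="multi_label_classification",
)
model = DistilBertForSequenceClassification(config)

# Dummy batch: 2 sequences of 5 token ids, with multi-hot float labels
input_ids = torch.randint(0, 100, (2, 5))
labels = torch.tensor([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])

outputs = model(input_ids=input_ids, labels=labels)

# With this problem_type the model's own loss should match a plain BCE-with-logits loss
reference = torch.nn.functional.binary_cross_entropy_with_logits(outputs.logits, labels)
print(outputs.loss.item(), reference.item())
```

As far as I know, this built-in loss is unweighted, so if you want to keep using pos_weight you would probably still need to override compute_loss.
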

RuudVelo · November 17, 2021, 3:59pm

Thank you very much!

patrickla · September 26, 2023, 4:55pm

Hey nielsr, your reference is pretty great, thank you!

I’m using your code as a reference to train a multi-label classifier using mbart-50-large. Unfortunately, I get a lot of errors when I try to put both the tokenizer and the model on the GPU. Do you have any previous experience using CUDA for this kind of task?
