I have a dataset of roughly 44k data points and 1,500 labels. I want to use AutoModelForSequenceClassification to classify the labels, and I am able to pass in "multi_label_classification" as the problem type. However, the F1 score and accuracy are quite poor. I suspect it's because the data is sparse and 0 labels are preferred over 1 labels (since there are far fewer 1s than 0s across all categories).
Although this is a broad problem, in particular I was wondering how I can pass the "pos_weight" parameter to the "BCEWithLogitsLoss" function that the model uses during training, through the Trainer API or TrainingArguments.
I looked at the source code, and it doesn't seem to accept such a parameter.
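For reference, pos_weight scales the loss term for positive targets per class, so a value above 1 pushes the model toward predicting 1s for rare classes. A quick standalone illustration (the numbers here are made up):

import torch
from torch.nn import BCEWithLogitsLoss

logits = torch.tensor([[0.2, -1.3]])   # one example, two labels
targets = torch.tensor([[1.0, 0.0]])

# weight positives of the first class ten times more heavily
weighted = BCEWithLogitsLoss(pos_weight=torch.tensor([10.0, 1.0]))
unweighted = BCEWithLogitsLoss()
print(unweighted(logits, targets), weighted(logits, targets))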
Hello @ayush-adeptmind,
From my understanding of your question, you want to introduce a weighted loss into your multi-label classification problem.
For such a problem, the best way I'm aware of while using the transformers library is to subclass the Trainer class (create a new trainer class that inherits from the actual Trainer) and override its compute_loss method.
When instantiating BCEWithLogitsLoss, you should be able to pass in a pos_weight vector.
See "How can I use class_weights when training?" for more precise information and code snippets.
Have a good day!
Adding the exact code for my implementation, based on the source provided above. Thanks!
from typing import Optional

import logging

from torch import FloatTensor
from torch.nn import BCEWithLogitsLoss
from transformers import Trainer


class WeightedTrainer(Trainer):
    def __init__(self, *args, class_weights: Optional[FloatTensor] = None, **kwargs):
        super().__init__(*args, **kwargs)
        if class_weights is not None:
            class_weights = class_weights.to(self.args.device)
            logging.info("Using multi-label classification with class weights: %s", class_weights)
        # pos_weight=None falls back to an ordinary unweighted BCEWithLogitsLoss
        self.loss_fct = BCEWithLogitsLoss(pos_weight=class_weights)
    def compute_loss(self, model, inputs, return_outputs=False):
        """
        How the loss is computed by Trainer. By default, all models return the loss in the first element.

        Subclass and override for custom behavior.
        """
        labels = inputs.pop("labels").float()  # BCEWithLogitsLoss expects float targets
        outputs = model(**inputs)
        try:
            loss = self.loss_fct(outputs.logits.view(-1, model.num_labels), labels.view(-1, model.num_labels))
        except AttributeError:  # model is wrapped in DataParallel, so num_labels lives on model.module
            loss = self.loss_fct(outputs.logits.view(-1, model.module.num_labels), labels.view(-1, model.num_labels))
        return (loss, outputs) if return_outputs else loss
trainer = WeightedTrainer(
    model,
    args,
    train_dataset=ds_enc["train"],
    eval_dataset=ds_enc["test"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
    class_weights=weights,
)
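One note: weights here is a FloatTensor with one positive weight per label, computed before the trainer is constructed. The post doesn't show how it was built; a common recipe (just a sketch, assuming ds_enc["train"]["labels"] holds multi-hot label vectors) is the per-class negative-to-positive ratio:

import torch

# build a (num_examples, num_labels) multi-hot float matrix from the dataset
label_matrix = torch.tensor(ds_enc["train"]["labels"], dtype=torch.float)
pos_counts = label_matrix.sum(dim=0)              # number of 1s per class
neg_counts = label_matrix.shape[0] - pos_counts   # number of 0s per class
weights = neg_counts / pos_counts.clamp(min=1.0)  # clamp avoids division by zero

With that in place, trainer.train() uses the weighted loss for every batch.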