I struggled with multi-label classification using the transformers Trainer until I found the following modification of the Trainer class here: Trainer
For some reason I had also modified the Trainer for a multi-class classification problem of mine:
First, I discovered that there is no need to do that, because the Trainer already handles multi-class classification by default.
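To illustrate (a minimal sketch, not my exact setup; the model name, num_labels, and dataset variables are placeholders):

from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

# a standard N-class head: integer labels, cross-entropy applied internally
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=5)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,   # already-tokenized dataset with input_ids, attention_mask, labels
    eval_dataset=eval_dataset,
)
trainer.train()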
The problem seemed to be caused by my multi-class Trainer modification.
The issue arises because in your compute_loss function you return the outputs as a tensor instead of a dict; in the HF Trainer, if the outputs are not a dictionary, the first element is sliced off, which here removes the first sample of the batch.
Reference here:
if isinstance(outputs, dict):
    logits = tuple(v for k, v in outputs.items() if k not in ignore_keys + ["loss"])
else:
    logits = outputs[1:]
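To see what that does to a plain tensor, here is a tiny illustration (the shapes are made up):

import torch

outputs = torch.randn(8, 3)   # logits for a batch of 8 samples and 3 classes
logits = outputs[1:]          # the non-dict branch above slices the tensor itself
print(logits.shape)           # torch.Size([7, 3]) -- the first sample is gone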
I solved this issue like this:
import torch.nn as nn
from transformers import Trainer

loss_fn = nn.CrossEntropyLoss()  # original loss_fn not shown; assuming cross-entropy for multi-class

class CustomTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        outputs = model(inputs['input_ids'], inputs['attention_mask'])
        loss = loss_fn(outputs, inputs['target'])
        # return the outputs as a dict so the Trainer keeps all of them
        return (loss, {"label": outputs}) if return_outputs else loss
Here I return the outputs in the form of a dictionary, and this solves the issue for me.
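For completeness, a hypothetical usage sketch (the model, dataset variables, and TrainingArguments values are placeholders; label_names is set because my batches use the key 'target' instead of 'labels'):

from transformers import TrainingArguments

args = TrainingArguments(output_dir="out", label_names=["target"])  # "target" holds the labels here
trainer = CustomTrainer(
    model=model,                   # the multi-class model used above
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
trainer.train()
preds = trainer.predict(eval_dataset)
print(preds.predictions.shape)     # full eval set size, no sample silently dropped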