EvalPrediction has an unequal number of label_ids and predictions 😫

The `EvalPrediction` object received in the Trainer's `compute_metrics` function contains an unequal number of `label_ids` and `predictions`.

My compute_metrics function given to the Trainer looks like this:
[screenshot]
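Since the screenshot isn't visible here, a hypothetical sketch of a multi-label `compute_metrics` along these lines (assuming logits and labels both have shape `(batch, num_labels)`; the metric name is illustrative):

```python
import numpy as np

def compute_metrics(eval_pred):
    # eval_pred.predictions / eval_pred.label_ids are the standard
    # EvalPrediction fields the Trainer passes in
    logits, labels = eval_pred.predictions, eval_pred.label_ids
    # Sigmoid + 0.5 threshold turns logits into multi-label predictions
    probs = 1 / (1 + np.exp(-logits))
    preds = (probs >= 0.5).astype(int)
    # Subset accuracy: every label position in a sample must match
    exact_match = (preds == labels).all(axis=-1).mean()
    return {"subset_accuracy": float(exact_match)}
```

If the two arrays have mismatched first dimensions, the elementwise comparison above is exactly where it blows up.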

The shapes I print:
[screenshot]

And the error message I get:

Does any of you have a solution to this? :hugs:

What does your data look like? I’m afraid there might have been some labels that were a list concatenated with single labels or something like that.

I found a solution to the problem!

I struggled with doing multi-label classification using the transformers Trainer until I found the following modification of the Trainer class here: Trainer

[screenshot]
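The screenshot isn't visible here, but the usual multi-label modification swaps the loss for binary cross-entropy on raw logits. A hypothetical NumPy sketch of that loss (the numerically stable form, averaged over all label positions):

```python
import numpy as np

def bce_with_logits(logits, targets):
    # Stable binary cross-entropy computed directly on logits,
    # equivalent in spirit to torch.nn.BCEWithLogitsLoss:
    # max(x, 0) - x*t + log(1 + exp(-|x|))
    return np.mean(
        np.maximum(logits, 0) - logits * targets + np.log1p(np.exp(-np.abs(logits)))
    )
```

With a zero logit and a positive target this gives log(2), and it shrinks toward zero as the logit agrees more strongly with the target.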

For some reason I had also modified the Trainer for a multi-class classification problem I have:
[screenshot]

First, I discovered that there is no need for doing that, because the Trainer already handles multi-class classification by default :joy:

The problem seemed to be caused by my multi-class Trainer modification :sweat_smile:

The issue comes from your `compute_loss` function outputting a tensor instead of a dict: in the HF Trainer, the first element of the outputs is stripped (assumed to be the loss) whenever the outputs are not a dictionary.
Reference here:

if isinstance(outputs, dict):
    logits = tuple(v for k, v in outputs.items() if k not in ignore_keys + ["loss"])
else:
    logits = outputs[1:]
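The effect of that branch can be seen in a plain-Python sketch: when the outputs are a bare array, `outputs[1:]` slices off what it assumes is the loss, so a whole sample silently disappears from the predictions.

```python
import numpy as np

ignore_keys = []

def extract_logits(outputs):
    # Mirrors the Trainer branch quoted above
    if isinstance(outputs, dict):
        return tuple(v for k, v in outputs.items() if k not in ignore_keys + ["loss"])
    return outputs[1:]

batch_logits = np.zeros((8, 3))  # 8 samples, 3 classes

# Bare array: element 0 is treated as the loss, so one sample is lost
sliced = extract_logits(batch_logits)        # shape (7, 3)

# Dict: the logits pass through intact
kept = extract_logits({"logits": batch_logits})[0]  # shape (8, 3)
```

That off-by-one batch dimension is exactly why `label_ids` and `predictions` end up with unequal lengths in `compute_metrics`.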

I solved this issue like this:

class CustomTrainer(Trainer):

    def compute_loss(self, model, inputs, return_outputs=False):
        outputs = model(inputs['input_ids'], inputs['attention_mask'])

        loss = loss_fn(outputs, inputs['target'])
        # Return the outputs wrapped in a dict so the Trainer does not
        # strip the first element with outputs[1:]
        return (loss, {"label": outputs}) if return_outputs else loss

Here I return the outputs in the form of a dictionary, and this solves the issue for me.
