Custom Loss: compute_loss() got an unexpected keyword argument 'return_outputs'

Hello,

I have created my own trainer with a custom loss function:

import torch
from torch.nn import CrossEntropyLoss
from transformers import Trainer

device = torch.device("cuda")
# class_weights starts out as a NumPy array with one weight per class
class_weights = torch.from_numpy(class_weights).float().to(device)

class MyTrainer(Trainer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def compute_loss(self, model, inputs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs[0]
        # Cross-entropy with a per-class weight
        loss = CrossEntropyLoss(weight=class_weights)
        return loss(logits, labels)

Yet after training for an hour, it went to run an evaluation and I got this error:

TypeError: compute_loss() got an unexpected keyword argument 'return_outputs'

I don’t call compute_loss() anywhere in my own code, so I don’t think I am passing anything to it. Could it have something to do with my custom loss?

By the way, this is my trainer:

trainer = MyTrainer(
    args=training_args,               # training arguments, defined above
    train_dataset=train_dataset,      # training dataset
    eval_dataset=val_dataset,         # evaluation dataset
    model_init=model_init,
    compute_metrics=compute_metrics,
)

Hi @theudster, I ran into just this problem today :slight_smile:

The solution is to change the signature of your compute_loss to reflect what is implemented in the source code:

def compute_loss(self, model, inputs, return_outputs=False):
    ...
    return (loss, outputs) if return_outputs else loss

It seems that the example in the Trainer docs is not up to date, so I suggest inspecting the source code of compute_loss as a reference for now.
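
To see why the old signature fails, here is a minimal sketch that does not use transformers at all. `BaseTrainer`, `OldStyleTrainer`, `NewStyleTrainer`, and `toy_model` are hypothetical stand-ins, and the assumption (based on the error message above) is that the library calls compute_loss with return_outputs passed by keyword during evaluation.

```python
class BaseTrainer:
    """Hypothetical stand-in for Trainer; not the real transformers class."""

    def compute_loss(self, model, inputs, return_outputs=False):
        outputs = model(inputs)
        loss = outputs["loss"]
        return (loss, outputs) if return_outputs else loss

    def prediction_step(self, model, inputs):
        # The caller passes return_outputs by keyword, so every override must accept it
        loss, outputs = self.compute_loss(model, inputs, return_outputs=True)
        return loss, outputs


class OldStyleTrainer(BaseTrainer):
    def compute_loss(self, model, inputs):  # return_outputs is missing
        return model(inputs)["loss"]


class NewStyleTrainer(BaseTrainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        outputs = model(inputs)
        loss = outputs["loss"] * 2.0  # placeholder for a custom loss
        return (loss, outputs) if return_outputs else loss


def toy_model(inputs):
    # Toy "model" that returns a dict holding a loss value and logits
    return {"loss": 0.5, "logits": [0.1, 0.9]}


try:
    OldStyleTrainer().prediction_step(toy_model, {})
    old_error = None
except TypeError as exc:
    old_error = str(exc)

new_loss, new_outputs = NewStyleTrainer().prediction_step(toy_model, {})
print(old_error)
print(new_loss)
```

The old-style override only breaks once prediction_step runs, which would match the report of training for an hour before the error appeared at evaluation time.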

First of all, thank you @lewtun for all your input, you have been helping me loads the past few days and I am really grateful! This forum is amazing.
Just one thing: I am not sure what should go in the return statement, since both the loss and the outputs are there.
Would it be

return (loss(logits, labels), outputs) if return_outputs else loss

or do we not need to pass the logits and labels to loss, as in your example?

Happy to help :slight_smile:

I think in your case what you might need is something like

return (loss(logits, labels), outputs) if return_outputs else loss(logits, labels)

since my understanding is that your loss object is really a loss function and we should be returning a scalar in compute_loss. As a sanity check you can try feeding some inputs to your function to see what the outputs look like, e.g.

trainer.compute_loss(model, inputs, return_outputs=False)
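
The difference between the loss function and the loss value can be sketched without torch. `squared_error` below is a hypothetical stand-in for a loss object such as CrossEntropyLoss(...): it has to be called to produce a number, and that number cannot be called again.

```python
def squared_error(prediction, target):
    """Hypothetical stand-in for a loss object: call it to get a number."""
    return (prediction - target) ** 2


def compute_loss(prediction, target, return_outputs=False):
    loss_fn = squared_error             # the loss function (callable)
    loss = loss_fn(prediction, target)  # the loss value
    outputs = {"prediction": prediction}
    return (loss, outputs) if return_outputs else loss


loss_only = compute_loss(3.0, 1.0)
loss_and_outputs = compute_loss(3.0, 1.0, return_outputs=True)

# Calling the value a second time fails, because a number is not callable
try:
    loss_only(3.0, 1.0)
    second_call_error = None
except TypeError as exc:
    second_call_error = type(exc).__name__
```

Both branches of the return statement should therefore hand back the computed value, never the function itself.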

PS the reason we now need return_outputs in the signature is that it’s used in the prediction_step function to get the logits when evaluation runs during training (see the transformers.trainer page of the transformers 4.3.0 documentation).

:frowning: very strange… I am still getting an error in the prediction step…

AttributeError: 'NoneType' object has no attribute 'detach'

I can’t think what the issue may be. Perhaps it is related to class_weights, which I converted with torch.from_numpy()?

Can you share the full stack trace to see what’s causing this?

AttributeError                            Traceback (most recent call last)
<ipython-input-21-3435b262f1ae> in <module>()
----> 1 trainer.train()

7 frames
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, **kwargs)
    987 
    988             self.control = self.callback_handler.on_epoch_end(self.args, self.state, self.control)
--> 989             self._maybe_log_save_evaluate(tr_loss, model, trial, epoch)
    990 
    991             if self.args.tpu_metrics_debug or self.args.debug:

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in _maybe_log_save_evaluate(self, tr_loss, model, trial, epoch)
   1056         metrics = None
   1057         if self.control.should_evaluate:
-> 1058             metrics = self.evaluate()
   1059             self._report_to_hp_search(trial, epoch, metrics)
   1060 

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in evaluate(self, eval_dataset, ignore_keys, metric_key_prefix)
   1511             prediction_loss_only=True if self.compute_metrics is None else None,
   1512             ignore_keys=ignore_keys,
-> 1513             metric_key_prefix=metric_key_prefix,
   1514         )
   1515 

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in prediction_loop(self, dataloader, description, prediction_loss_only, ignore_keys, metric_key_prefix)
   1628 
   1629         for step, inputs in enumerate(dataloader):
-> 1630             loss, logits, labels = self.prediction_step(model, inputs, prediction_loss_only, ignore_keys=ignore_keys)
   1631             if loss is not None:
   1632                 losses = loss.repeat(batch_size)

/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in prediction_step(self, model, inputs, prediction_loss_only, ignore_keys)
   1760 
   1761         if has_labels:
-> 1762             labels = nested_detach(tuple(inputs.get(name) for name in self.label_names))
   1763             if len(labels) == 1:
   1764                 labels = labels[0]

/usr/local/lib/python3.7/dist-packages/transformers/trainer_pt_utils.py in nested_detach(tensors)
    109     "Detach `tensors` (even if it's a nested list/tuple of tensors)."
    110     if isinstance(tensors, (list, tuple)):
--> 111         return type(tensors)(nested_detach(t) for t in tensors)
    112     return tensors.detach()
    113 

/usr/local/lib/python3.7/dist-packages/transformers/trainer_pt_utils.py in <genexpr>(.0)
    109     "Detach `tensors` (even if it's a nested list/tuple of tensors)."
    110     if isinstance(tensors, (list, tuple)):
--> 111         return type(tensors)(nested_detach(t) for t in tensors)
    112     return tensors.detach()
    113 

/usr/local/lib/python3.7/dist-packages/transformers/trainer_pt_utils.py in nested_detach(tensors)
    110     if isinstance(tensors, (list, tuple)):
    111         return type(tensors)(nested_detach(t) for t in tensors)
--> 112     return tensors.detach()
    113 
    114 

AttributeError: 'NoneType' object has no attribute 'detach'

OK, it seems there is a problem in the nested_detach function. Do your inputs have a labels field (which seems to be the default for self.label_names)?

If that doesn’t solve the problem, my suggestion would be to debug the prediction_step function by getting a batch of inputs and seeing what happens when you apply nested_detach to them.
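
A minimal sketch of that debugging step, assuming a plain dict stands in for a batch. `FakeTensor` is a hypothetical placeholder for torch.Tensor, `missing_label_names` is a made-up helper, and `nested_detach` is re-typed from the body shown in the stack trace above rather than imported.

```python
class FakeTensor:
    """Hypothetical placeholder for torch.Tensor with only a detach() method."""

    def __init__(self, values):
        self.values = values

    def detach(self):
        return FakeTensor(self.values)


def nested_detach(tensors):
    # Same logic as the function body visible in the stack trace
    if isinstance(tensors, (list, tuple)):
        return type(tensors)(nested_detach(t) for t in tensors)
    return tensors.detach()


def missing_label_names(batch, label_names=("labels",)):
    # Names that would be looked up in the batch but are absent (or None)
    return [name for name in label_names if batch.get(name) is None]


good_batch = {"input_ids": FakeTensor([101, 102]), "labels": FakeTensor([1])}
bad_batch = {"input_ids": FakeTensor([101, 102])}  # no "labels" key

good_missing = missing_label_names(good_batch)
bad_missing = missing_label_names(bad_batch)

# Reproduce the failing line from prediction_step on the batch without labels
try:
    nested_detach(tuple(bad_batch.get(name) for name in ("labels",)))
    detach_error = None
except AttributeError as exc:
    detach_error = str(exc)
```

If the check on a real batch reports a missing name, the None that reaches nested_detach comes from the batch itself and not from the detach logic.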

I am only trying to do this because my dataset is unbalanced and I want the loss function to take that into consideration…
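
As a rough illustration of what class weights do to the loss, here is a pure-Python sketch of weighted cross-entropy. It assumes the weighted-mean reduction that PyTorch documents for CrossEntropyLoss with reduction='mean' (weighted sum divided by the sum of the target weights). The function name, logits, labels, and weights are all made up for the example.

```python
import math


def weighted_cross_entropy(logits, labels, class_weights):
    """Weighted mean of per-example cross-entropy (a sketch, not the torch code)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for row, label in zip(logits, labels):
        # Negative log-softmax of the true class, with a stable logsumexp
        row_max = max(row)
        log_norm = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        nll = log_norm - row[label]
        weighted_sum += class_weights[label] * nll
        weight_total += class_weights[label]
    return weighted_sum / weight_total


logits = [[2.0, 0.0], [2.0, 0.0], [2.0, 0.0]]  # the model always favours class 0
labels = [0, 0, 1]                              # class 1 is the rare class

plain = weighted_cross_entropy(logits, labels, [1.0, 1.0])
weighted = weighted_cross_entropy(logits, labels, [1.0, 4.0])
```

With the larger weight on class 1, the single misclassified rare example dominates the average, so `weighted` comes out higher than `plain`.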

I’m not even sure what inputs is. From what I understand, it is the data from the training data loader; if so, it should have a labels field, seeing as I defined the dataset as follows:

import torch 

class SequenceDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels
        
    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['label'] = torch.tensor(self.labels[idx])
        return item
    
    def __len__(self):
        return len(self.labels)

I see you ran into the same issue at the end of this notebook. Did you find a solution?

Oh, thanks for flagging that the example in the docs was not up to date. If any of you want to open a PR to fix it, that would be awesome!

For the labels problem, I think from what I see in the stack trace that you do not have an install from source. There was a bug that was recently fixed: your compute_loss function pops the labels out of the inputs, so they are no longer there afterwards. We now copy them before calling compute_loss to avoid that.
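
The effect described here can be reproduced with plain dictionaries. This is only a sketch: `compute_loss_with_pop`, `compute_loss_with_get`, and `evaluate_step` are hypothetical stand-ins, and the copy in `evaluate_step` illustrates the general idea of the fix rather than the library's actual code.

```python
def compute_loss_with_pop(inputs):
    # Mirrors the custom loss above: pop() removes "labels" from the caller's dict
    labels = inputs.pop("labels")
    return float(len(labels))


def compute_loss_with_get(inputs):
    # Alternative: read the labels without mutating the dict
    labels = inputs.get("labels")
    return float(len(labels))


def evaluate_step(loss_fn, inputs, copy_first):
    # Pass a shallow copy when copy_first is set, so the caller's dict is protected
    loss = loss_fn(dict(inputs) if copy_first else inputs)
    # After the loss is computed, the labels are looked up again
    return loss, inputs.get("labels")


_, labels_after_pop = evaluate_step(compute_loss_with_pop, {"labels": [0, 1]}, copy_first=False)
_, labels_after_copy = evaluate_step(compute_loss_with_pop, {"labels": [0, 1]}, copy_first=True)
_, labels_after_get = evaluate_step(compute_loss_with_get, {"labels": [0, 1]}, copy_first=False)
```

Only the first call loses the labels, which is the situation that leads to the 'NoneType' error in the stack trace. Either copying the inputs or reading the labels with get() keeps them available.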

Thanks @sgugger! I forgot that

pip install git+https://github.com/huggingface/transformers.git

is an effective debugging tactic :smiley:

Here’s the PR with the fix: Fix example of custom Trainer to reflect signature of compute_loss by lewtun · Pull Request #10537 · huggingface/transformers · GitHub

Thank you! Good to know about the source install.

Hi @lewtun & @sgugger! I have been having a similar issue with compute_loss() in a WeightedLossTrainer. Here is the code for the WeightedLossTrainer:

class WeightedLossTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        outputs = model(**inputs)
        logits = outputs.get("logits")
        labels = inputs.get("labels")
        loss_func = torch.nn.CrossEntropyLoss(weight=TorchWeights)
        loss = loss_func(logits, labels)
        return (loss(logits, labels), outputs) if return_output else loss(logits, labels)

This is the code for tokenizing and aligning the labels:

def tokenize_and_align_labels(examples):
    tokenized_inputs = tokenizer(
        examples["tokens"],
        truncation=True,
        padding=True,
        is_split_into_words=True,
        return_offsets_mapping=True,
    )
    all_labels = examples["ner_tags"]
    new_labels = []
    for i, labels in enumerate(all_labels):
        word_ids = tokenized_inputs.word_ids(i)
        new_labels.append(align_labels_with_tokens(labels, word_ids))

    tokenized_inputs["labels"] = new_labels
    return tokenized_inputs

Here is a screenshot of the error:

Any help? I can share the full code if needed.

Best,
Ghadeer