Popping `inputs[labels]` when self.label_smoother is not None (in trainer.py)

jbeh · November 11, 2021, 5:19am

Hi,

I was training my seq2seq model (I’m using Seq2SeqTrainer) with label smoothing and hit an error saying that input_ids was required in my training dataset, even though I had checked that they were in the dataset.

While debugging, I found that when self.label_smoother is not None, the labels item is popped from inputs, and the error comes from outputs = model(**inputs), as shown in the following lines of trainer.py:

1872     def compute_loss(self, model, inputs, return_outputs=False):
1873         """
1874         How the loss is computed by Trainer. By default, all models return the loss in the first element.
1875 
1876         Subclass and override for custom behavior.
1877         """
1878         if self.label_smoother is not None and "labels" in inputs:
1879             labels = inputs.pop("labels")
1880         else:
1881             labels = None
1882         outputs = model(**inputs)

Question: is line 1879 intended? I think it should be either
labels = copy.deepcopy(inputs['labels']) or labels = inputs['labels']
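
For readers comparing the two options, here is a minimal sketch on a plain dict of how `pop` differs from plain indexing. `toy_forward` is a hypothetical stand-in for `model(**inputs)`, not anything from trainer.py, and the token IDs are made up:

```python
def toy_forward(**kwargs):
    """Hypothetical stand-in for model(**inputs): report which keys arrive."""
    return sorted(kwargs)


batch = {"input_ids": [5, 6, 7], "labels": [8, 9, 2]}

# Variant 1: pop, as on line 1879. The key is removed from the dict,
# so the forward call never sees "labels".
popped = dict(batch)
labels_popped = popped.pop("labels")
seen_after_pop = toy_forward(**popped)

# Variant 2: plain indexing. The key stays, so the forward call still
# receives "labels" (and a real model would compute its own loss).
indexed = dict(batch)
labels_ref = indexed["labels"]
seen_after_index = toy_forward(**indexed)

print(seen_after_pop)    # keys the model sees after pop
print(seen_after_index)  # keys the model sees after indexing
```

Indexing returns a reference to the same object, so a `copy.deepcopy` would only matter if the labels were modified in place afterwards.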

I searched this board but couldn’t find any similar post. That suggests other people are using label smoothing without any problem, so I may be misunderstanding seq2seq training or label smoothing.

Any comment would be greatly appreciated.

lewtun · November 11, 2021, 10:54am

Hey @jbeh can you share a minimal reproducible example? For example, something simple that just shows:

- How you load and tokenize the datasets
- How you define the training arguments
- How you define the trainer

That will help us understand better what is causing the issue

sgugger · November 11, 2021, 1:26pm

The labels are popped because otherwise your model computes the loss twice, which means two softmaxes, a very heavy operation. To use label smoothing with the Trainer, you need to pass along the decoder_input_ids, as generated by DataCollatorForSeq2Seq.
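
As a rough illustration of what "passing along the decoder_input_ids" involves, the sketch below builds them from the labels with the usual shift-right step, in pure Python. The function name, the token IDs, and the pad/start IDs are assumptions for the example, not the library's code. In transformers itself, my understanding is that passing `model=model` to DataCollatorForSeq2Seq makes the collator add this key for models that define `prepare_decoder_input_ids_from_labels`.

```python
def shift_tokens_right(labels, pad_token_id, decoder_start_token_id):
    """Illustrative shift-right: prepend the start token, drop the last
    token, and replace the -100 ignore index with the pad token."""
    shifted = []
    for row in labels:
        prev = [decoder_start_token_id] + list(row[:-1])
        shifted.append([pad_token_id if tok == -100 else tok for tok in prev])
    return shifted


# Made-up batch; -100 marks padded label positions ignored by the loss.
batch = {
    "input_ids": [[5, 6, 7, 2]],
    "labels": [[8, 2, -100, -100]],
}

# Add decoder_input_ids before the labels are popped (assumed IDs: pad=1, start=0).
batch["decoder_input_ids"] = shift_tokens_right(
    batch["labels"], pad_token_id=1, decoder_start_token_id=0
)

# Mimic line 1879: labels leave the dict, but the decoder inputs remain,
# so the model can still run its decoder without computing its own loss.
labels = batch.pop("labels")
print(sorted(batch))
print(batch["decoder_input_ids"])
```

With the key already present in the batch, popping the labels no longer leaves the decoder without inputs.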
