Hi,
I was training my seq2seq model (I'm using `Seq2SeqTrainer`) with label smoothing and ran into an error saying that `input_ids` was required in my training dataset, even though I had checked that it is present in the dataset. While debugging, I found that when `self.label_smoother` is not `None`, the `labels` item is popped out of `inputs`, and the error is raised at `outputs = model(**inputs)`, as shown in the following lines of `trainer.py`:
```python
1872 def compute_loss(self, model, inputs, return_outputs=False):
1873     """
1874     How the loss is computed by Trainer. By default, all models return the loss in the first element.
1875
1876     Subclass and override for custom behavior.
1877     """
1878     if self.label_smoother is not None and "labels" in inputs:
1879         labels = inputs.pop("labels")
1880     else:
1881         labels = None
1882     outputs = model(**inputs)
```
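To confirm what I was seeing, I reproduced the effect of that `pop` on a plain dict (toy lists standing in for the real tensors in my batch, so this is just a sketch of the dict semantics):

```python
# Toy reproduction of the pop at line 1879 (plain lists stand in for tensors).
inputs = {"input_ids": [[101, 2023, 102]], "labels": [[2023, 102, -100]]}

labels = inputs.pop("labels")  # "labels" is removed from inputs entirely
print("labels" in inputs)      # False -> model(**inputs) is called without labels
print(labels)                  # [[2023, 102, -100]]
```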
Question: is line 1879 intentional? I would expect it to be either `labels = copy.deepcopy(inputs['labels'])` or `labels = inputs['labels']`, so that `labels` stays in `inputs`.
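To make the difference concrete, here is a toy comparison of the current `pop` against the two alternatives I had in mind (again with plain lists instead of tensors):

```python
import copy

batch = {"input_ids": [[101, 2023, 102]], "labels": [[2023, 102, -100]]}

# Current code: "labels" leaves the dict before model(**inputs) is called.
popped_batch = dict(batch)
labels_via_pop = popped_batch.pop("labels")
print("labels" in popped_batch)  # False

# Alternative 1: deep copy; batch keeps "labels" and labels is independent.
labels_via_copy = copy.deepcopy(batch["labels"])
print("labels" in batch)         # True

# Alternative 2: plain lookup; batch keeps "labels" but shares the object.
labels_via_ref = batch["labels"]
print(labels_via_ref is batch["labels"])  # True
```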
I searched this board but couldn't find any similar post. That suggests other people are using label smoothing without any problem, and that I'm misunderstanding something about seq2seq training and label smoothing.
Any comment would be greatly appreciated.