Hello,

I'd like to update my training script, which uses Seq2SeqTrainer, to the newest version, v4.2.1. My code worked with v3.5.1, but after updating it no longer runs on v4.2.1: a ValueError is raised.
File "/****/seq2seq_trainer.py", line 193, in compute_loss
loss, _ = self._compute_loss(model, inputs, labels)
File "/****/seq2seq_trainer.py", line 180, in _compute_loss
loss = self.loss_fn(logits.view(-1, logits.shape[-1]), labels.view(-1))
ValueError: Expected input batch_size (464) to match target batch_size (480).
I tried some print debugging and inserted the following into `_compute_loss`:
```python
def _compute_loss(self, model, inputs, labels):
    if self.args.label_smoothing == 0:
        if self.data_args is not None and self.data_args.ignore_pad_token_for_loss:
            # force training to ignore pad token
            logits = model(**inputs, use_cache=False)[0]
            print(inputs["input_ids"].shape)
            print(logits.shape)
            print(labels.shape)
            loss = self.loss_fn(logits.view(-1, logits.shape[-1]), labels.view(-1))
```
and got:

```
torch.Size([8, 58])
torch.Size([8, 58, 50266])
torch.Size([8, 60])
```
(I added my own special token, so the vocabulary size becomes 50266.)
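So the flattened logits have 8 × 58 = 464 rows while the flattened labels have 8 × 60 = 480 entries, which matches the error message exactly: the decoder output is two tokens shorter than the labels.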
Am I missing some processing step that is now required when updating my script for the new version?
In the new Seq2SeqDataCollator, it seems that shift_tokens_right, which used to be imported from transformers.models.bart.modeling_bart, is no longer needed. I updated my own data collator based on this new Seq2SeqDataCollator, and I think whatever I'm misunderstanding is related to this.
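For what it's worth, my current understanding (which may be wrong) is that in v4.x the BART model builds decoder_input_ids from labels by itself whenever the collator passes labels but no decoder_input_ids. Here is a toy sketch of what I believe happens, paraphrased from my reading of modeling_bart.py; the token ids below are made up for illustration:

```python
import torch
from transformers.models.bart.modeling_bart import shift_tokens_right

# Toy labels batch: 1 = pad, 2 = eos (made-up ids, just for illustration)
labels = torch.tensor([[42, 43, 44, 2, 1, 1]])

# In v4.x the function also takes decoder_start_token_id;
# the model calls this internally with values from its config.
decoder_input_ids = shift_tokens_right(
    labels,
    pad_token_id=1,
    decoder_start_token_id=2,
)
# decoder_input_ids keeps the same sequence length as labels,
# so I'd expect logits of shape [8, 60, 50266], not [8, 58, 50266].
```

If that reading is right, I don't see where the length-58 decoder sequence comes from, which makes me wonder whether my ported collator is still passing decoder_input_ids of a different length.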
Thank you in advance.