Seq2Seq Loss computation in Trainer

dpernes · October 28, 2021, 1:01pm

That’s actually a mistake in the documentation: it should say “by shifting the labels” instead of “by shifting the input_ids”. Can you open a PR to fix this?

Sure, I will.

Seems like the implementation is correct

Yes, now everything makes sense, thank you!
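
For anyone landing here later, this is a rough sketch of what “shifting the labels” means when the decoder inputs are built from the labels. It is plain Python rather than the library's own code, and the function name `shift_labels_right`, the token ids, and the use of `-100` as the ignored-label value are assumptions made for illustration.

```python
IGNORE_INDEX = -100  # assumed label value that the loss skips (masked positions)


def shift_labels_right(labels, pad_token_id, decoder_start_token_id):
    """Build decoder inputs from labels (hypothetical helper).

    Each row is shifted one position to the right: the start token is
    prepended and the last label is dropped, so the decoder sees token t-1
    when it is asked to predict label t.
    """
    shifted = []
    for row in labels:
        new_row = [decoder_start_token_id] + list(row[:-1])
        # Masked label positions are not valid token ids, so replace them
        # with the padding id before feeding them to the decoder.
        new_row = [pad_token_id if tok == IGNORE_INDEX else tok for tok in new_row]
        shifted.append(new_row)
    return shifted


# Toy batch: 1 stands in for an end-of-sequence id, -100 marks ignored positions.
labels = [[42, 43, 44, 1], [50, 1, -100, -100]]
decoder_input_ids = shift_labels_right(labels, pad_token_id=0, decoder_start_token_id=0)
print(decoder_input_ids)
```

The loss is then computed between the decoder's predictions and the original, unshifted labels, which is why the shift applies to the labels and not to the encoder's input_ids.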
