Encoder Decoder Loss

valhalla · March 15, 2021, 7:37am

The EncoderDecoder model calculates the standard auto-regressive cross-entropy loss using the labels, i.e. the output sequence. It just shifts the labels inside the model before computing the loss, so you pass the unshifted output sequence as labels.

It’s the same loss used in other seq2seq models like BART and T5, and in decoder-only models like GPT2.
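
To make the shift concrete, here is a minimal pure-Python sketch of the decoder-only (GPT2-style) form, where the logits at position t are scored against the label at position t + 1. This is an illustration, not the library's code: the function names are made up for the example, and the -100 ignore value is assumed to mirror the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
import math

# Label value that is skipped when averaging the loss (assumed here to
# mirror the default ignore_index of PyTorch's CrossEntropyLoss).
IGNORE_INDEX = -100


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def shifted_cross_entropy(logits, labels):
    """Mean next-token loss: logits[t] is scored against labels[t + 1].

    logits: list of seq_len rows, each a list of vocab_size floats.
    labels: list of seq_len token ids (or IGNORE_INDEX).
    """
    losses = []
    # Drop the last logit row and the first label: that is the "shift".
    for step_logits, target in zip(logits[:-1], labels[1:]):
        if target == IGNORE_INDEX:
            continue
        losses.append(-log_softmax(step_logits)[target])
    if not losses:
        return float("nan")  # nothing to average over
    return sum(losses) / len(losses)


# Uniform logits over a 4-token vocabulary: every prediction costs log(4).
uniform_logits = [[0.0] * 4 for _ in range(4)]
uniform_loss = shifted_cross_entropy(uniform_logits, [0, 1, 2, 3])

# Confident, correct predictions give a loss close to zero.
peaked_logits = [[10.0, 0.0], [0.0, 10.0], [0.0, 0.0]]
peaked_loss = shifted_cross_entropy(peaked_logits, [0, 0, 1])
```

Seq2seq models such as BART and T5 reach the same alignment from the other side: the labels are shifted one position to the right to build the decoder inputs, and the logits are then compared with the unshifted labels.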

Hope this helps.
