About the ProphetNet model's n-gram loss calculation

    def _compute_loss(self, logits, labels, ignore_index=-100):
        # expend_targets: [ngram, batch, seq_len], initialized to ignore_index
        expend_targets = labels.new_zeros(self.config.ngram, labels.size(0), labels.size(1)).fill_(ignore_index)

        for i in range(self.config.ngram):
            if i > 0 and self.disable_ngram_loss:
                break
            expend_targets[i, :, :] = labels

I have a question about the n-gram loss calculation.

Following the _compute_loss function above, expend_targets has shape [ngram, batch, seq_len], and the logits have shape [ngram, batch, seq_len, vocab_size].

But the reference code shows that expend_targets copies the same labels along the first dimension (ngram).
I don't understand why the logits of every n-gram stream get the same target (label) ids.
I would have expected some shifting step, so that stream i targets the token i positions into the future.
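To make my question concrete, here is a toy sketch (plain Python, no torch, with made-up values) of what the loop above does: every n-gram stream ends up with an identical copy of labels, rather than a version shifted by i positions as I expected.

```python
# Toy illustration of how _compute_loss fills expend_targets.
# Every stream gets the SAME labels -- no per-stream shifting.
ngram = 2
labels = [[5, 6, 7, 8]]          # [batch=1, seq_len=4], dummy token ids
ignore_index = -100
disable_ngram_loss = False

# expend_targets: [ngram, batch, seq_len], initialized to ignore_index
expend_targets = [[[ignore_index] * len(labels[0]) for _ in labels]
                  for _ in range(ngram)]

for i in range(ngram):
    if i > 0 and disable_ngram_loss:
        break
    expend_targets[i] = [row[:] for row in labels]  # identical copy per stream

print(expend_targets)
# → [[[5, 6, 7, 8]], [[5, 6, 7, 8]]]
# What I expected instead: stream 1's target at position t would be
# labels[t + 1] (the token one step further into the future).
```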

Actually, it might be my misunderstanding.
Can anyone explain this to me?