I read an online tutorial about implementing fine-tuning from this website:
I do not understand why the author removed the last token ID for decoder_input_ids and the first token ID for labels, respectively.
See the following code:
import torch

for _, data in enumerate(loader, 0):
    # Full target sequence of token IDs
    y = data["target_ids"].to(device, dtype=torch.long)
    # decoder_input_ids: the target with its LAST token removed
    y_ids = y[:, :-1].contiguous()
    # labels: the target with its FIRST token removed
    lm_labels = y[:, 1:].clone().detach()
    # Replace padding IDs with -100 so the loss ignores those positions
    lm_labels[y[:, 1:] == tokenizer.pad_token_id] = -100
    ids = data["source_ids"].to(device, dtype=torch.long)
    mask = data["source_mask"].to(device, dtype=torch.long)
    outputs = model(
        input_ids=ids,
        attention_mask=mask,
        decoder_input_ids=y_ids,
        labels=lm_labels,
    )
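To make my question concrete, here is a toy example of what the two slices produce; the token IDs here are made up for illustration:

import torch

# Hypothetical target sequence, e.g. [start, A, B, C, eos] as made-up IDs
y = torch.tensor([[0, 5, 6, 7, 1]])

y_ids = y[:, :-1]     # tensor([[0, 5, 6, 7]]) -> decoder_input_ids
lm_labels = y[:, 1:]  # tensor([[5, 6, 7, 1]]) -> labels

So the labels look like the decoder inputs shifted left by one position. I can see what the slicing does, but not why this one-token shift is needed.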
I have read the original paper and the Hugging Face documentation, but I still do not understand. Could anyone explain this to me?