How does T5 create the correct decoder_input_ids?

lintang · September 19, 2020, 8:43pm

I have been reading the documentation for the T5 model, and in the training section, for both unsupervised denoising and supervised training, the comment states that the model is able to create the correct decoder_input_ids. However, I’m not sure how the model is able to do this. The documentation tells me that if decoder_input_ids is None (which it is by default), it takes the values of input_ids. Does this mean the start token is the first token of the input sequence?

The description also mentions a start-sequence token. But I can’t seem to find where this is generated or what the token is.

valhalla · September 20, 2020, 12:14pm

The T5PreTrainedModel._shift_right method here takes labels and prepares decoder_input_ids

Basically, all it does is:

take labels -> prepend the pad token (T5 uses it as the decoder start token) -> drop the last token (the eos, for an unpadded sequence) -> decoder_input_ids
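
To make the shift concrete, here is a minimal sketch in plain Python lists. It is not the library's code: the function name `shift_right_sketch` is hypothetical, the token ids are made up, and it assumes T5's usual convention that the pad id `0` doubles as the decoder start token and that `-100` marks label positions ignored by the loss.

```python
from typing import List

PAD_TOKEN_ID = 0      # assumption: T5's pad id, also used as the decoder start token
IGNORE_INDEX = -100   # assumption: label value that the loss ignores


def shift_right_sketch(
    labels: List[List[int]],
    decoder_start_token_id: int = PAD_TOKEN_ID,
    pad_token_id: int = PAD_TOKEN_ID,
) -> List[List[int]]:
    """Build decoder inputs by shifting each label row one position to the right."""
    shifted = []
    for row in labels:
        # Prepend the start token and drop the last position to keep the length.
        new_row = [decoder_start_token_id] + row[:-1]
        # Ignored label positions cannot be fed to the decoder, so use pad instead.
        new_row = [pad_token_id if tok == IGNORE_INDEX else tok for tok in new_row]
        shifted.append(new_row)
    return shifted


# Made-up ids; 1 stands in for the eos token.
labels = [
    [8774, 296, 55, 1],       # full-length target ending in eos
    [8774, 1, -100, -100],    # shorter target, padded with the ignore index
]
decoder_input_ids = shift_right_sketch(labels)
print(decoder_input_ids)  # [[0, 8774, 296, 55], [0, 8774, 1, 0]]
```

Note that in the second row the eos stays in the decoder inputs, because the shift drops only the final position, not the eos specifically.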

lintang · September 21, 2020, 1:30pm

Thanks for pointing this out for me!
