How does T5 create the correct decoder_input_ids?

I have been reading the documentation for the T5 model, and in the training section, for both unsupervised denoising and supervised training, the comment states that the model is able to create the correct `decoder_input_ids`. However, I'm not sure how the model is able to do this. The documentation tells me that if `decoder_input_ids` is None (which it is by default), it takes the values of `input_ids`. Does this mean the start token is the first token of the input sequence?

The description also mentions a start-of-sequence token, but I can't seem to find where this is generated or what the token is.

The `T5PreTrainedModel._shift_right` method here takes `labels` and prepares `decoder_input_ids`.

Basically, all it does is:

take `labels` -> prepend the pad token (which T5 uses as its decoder start token) -> drop the final token (the EOS) -> `decoder_input_ids`
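The shift described above can be sketched in plain Python, without the transformers library. This is a simplified illustration of what `_shift_right` does, not the actual implementation (the real method works on tensors and also replaces any `-100` label entries with the pad token id); the token ids below are made up for the example, except that 0 is T5's pad token id and 1 is its EOS id.

```python
PAD_TOKEN_ID = 0  # T5's decoder_start_token_id is its pad token id (0)

def shift_right(labels):
    """Prepend the pad/start token and drop the last token,
    mirroring the effect of T5PreTrainedModel._shift_right."""
    return [PAD_TOKEN_ID] + labels[:-1]

# labels ending with T5's EOS token (id 1); other ids are illustrative
labels = [37, 32, 1092, 1]
decoder_input_ids = shift_right(labels)
print(decoder_input_ids)  # [0, 37, 32, 1092]
```

So the decoder's first input is the pad token acting as the start-of-sequence token, and each target token is predicted from the tokens shifted one position to the right.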

Thanks for pointing this out for me!