Need help understanding the model's input during generation

Hi. I do not understand how the model in an encoder-decoder architecture handles its input during generation. In this line, the decoder receives decoder_input_ids as input, but if neither decoder_input_ids nor the targets are provided, it seems that the decoder receives nothing. In turn, here, in the generation code, the model receives only input_ids, which should be used only by the encoder, so the decoder would again be left with no input. Can anyone help me with this? I am surely missing something.
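
To make the question concrete, here is a toy sketch of what I would expect to happen: when decoder_input_ids is not given, generation seeds the decoder with a start token and then feeds the decoder its own output one step at a time. This is only my guess at the mechanism, not the library's code. Every name in it (START_TOKEN_ID, EOS_TOKEN_ID, encode, decode_step, generate) is hypothetical, and the "model" simply copies its input.

```python
from typing import List, Optional

START_TOKEN_ID = 0  # hypothetical decoder start token (assumption)
EOS_TOKEN_ID = 1    # hypothetical end-of-sequence token (assumption)


def encode(input_ids: List[int]) -> List[int]:
    """Stand-in encoder: returns the input unchanged as its 'states'."""
    return list(input_ids)


def decode_step(encoder_states: List[int], decoder_input_ids: List[int]) -> int:
    """Stand-in decoder: copies the source token by token, then emits EOS."""
    position = len(decoder_input_ids) - 1  # number of tokens generated so far
    if position < len(encoder_states):
        return encoder_states[position]
    return EOS_TOKEN_ID


def generate(
    input_ids: List[int],
    decoder_input_ids: Optional[List[int]] = None,
    max_new_tokens: int = 10,
) -> List[int]:
    """Greedy loop: input_ids go to the encoder only; the decoder is seeded
    with a start token when the caller passes no decoder_input_ids."""
    encoder_states = encode(input_ids)  # the encoder runs once
    if decoder_input_ids is None:
        decoder_input_ids = [START_TOKEN_ID]  # the decoder never starts empty
    else:
        decoder_input_ids = list(decoder_input_ids)  # do not mutate the caller's list
    for _ in range(max_new_tokens):
        next_id = decode_step(encoder_states, decoder_input_ids)
        decoder_input_ids.append(next_id)
        if next_id == EOS_TOKEN_ID:
            break
    return decoder_input_ids


print(generate([5, 6, 7]))  # [0, 5, 6, 7, 1]
```

If the real generation code works roughly like this, then passing only input_ids would be fine, because the decoder input is created inside generation. I would like to know whether that is what actually happens and where in the code it is done.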