Gap Sentences Generation using Pegasus

Hi, I'm trying to use pretrained Pegasus to predict masked sentences, as in the paper and the git repo, using [mask1].
How to accomplish this is still a mystery to me. The best I have found so far is this snippet of code: Link
But I'm having trouble understanding the example. Why is the output shape (last_hidden_states.shape) [1,4,1024]? Is it 3 for the 3 words plus one extra token?
Is there an option to not pass decoder_inputs, i.e. so that the decoder relies on its own predictions for the next word?
How should the input be constructed when there are multiple masked sentences? I can't find any example showing that.
Thanks for your help!!
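
For reference, here is a minimal sketch of how the input for several masked sentences might be built. It assumes the Hugging Face Pegasus tokenizer, where the sentence-level mask token is spelled `<mask_1>`, and it follows the paper's description of replacing each gap sentence with that token and concatenating the gap sentences as the target. The helper names `mask_sentences` and `fill_gaps`, the example sentences, and the `google/pegasus-large` checkpoint choice are illustrative assumptions, and the quality of the filled-in text is not guaranteed.

```python
# Assumption: "<mask_1>" is the sentence-level mask token in the
# Hugging Face Pegasus vocabulary (the paper writes it as [MASK1]).
MASK_SENT = "<mask_1>"


def mask_sentences(sentences, gap_indices, mask_token=MASK_SENT):
    """Replace each selected sentence with the mask token.

    Returns (source_text, target_text). The source has one mask token per
    gap sentence; the target is the gap sentences joined in document order.
    """
    gaps = set(gap_indices)
    source = " ".join(
        mask_token if i in gaps else s for i, s in enumerate(sentences)
    )
    target = " ".join(s for i, s in enumerate(sentences) if i in gaps)
    return source, target


def fill_gaps(source_text, model_name="google/pegasus-large"):
    """Hypothetical helper: let the model predict the masked sentences."""
    # Imported lazily so mask_sentences works without transformers installed.
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name)
    inputs = tokenizer(source_text, return_tensors="pt", truncation=True)
    # generate() decodes autoregressively: the decoder is fed its own
    # previous predictions, so no decoder inputs are passed here.
    output_ids = model.generate(**inputs, max_length=64)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


sentences = [
    "Pegasus is a summarization model.",
    "It is pretrained with gap sentence generation.",
    "Masked sentences are predicted by the decoder.",
]
source, target = mask_sentences(sentences, [0, 2])
print(source)
print(target)
```

Calling `fill_gaps(source)` would download the checkpoint and return the decoded prediction. Using `generate()` rather than calling the bare model is what removes the need to supply decoder inputs by hand.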