Feeding embeddings to `model.generate`

I am using GPT2 as the text generator for a video captioning model, so instead of feeding GPT2 token ids, I pass the video embeddings directly via the `inputs_embeds` parameter.
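For context, here is a minimal sketch of that kind of forward pass. It is not my actual model: the tiny randomly initialised config and the random `video_embeds` tensor are stand-ins (assumptions for illustration), chosen so the snippet runs without downloading weights. The only requirement is that the last dimension of the embeddings matches the model's hidden size (`n_embd`).

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny, randomly initialised GPT-2 (assumed sizes, for illustration only).
config = GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=100)
model = GPT2LMHeadModel(config)
model.eval()

# Hypothetical stand-in for projected video features:
# shape (batch, sequence_length, n_embd).
video_embeds = torch.randn(1, 8, config.n_embd)

# Pass embeddings instead of token ids; input_ids is left unset.
with torch.no_grad():
    outputs = model(inputs_embeds=video_embeds)

# One vocabulary distribution per input position.
logits = outputs.logits
print(logits.shape)
```

The forward call accepts embeddings without any token ids, which is what works for me during training.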

Now, during inference, I want to get the predicted sentences as output, so I’m trying to use GPT2's `.generate()` function, but as far as I can see it only takes token ids as input. Is there a way to give it the embeddings directly?