How to customize the "generate" function in pretrained models like BART?

For instance, I would like to take the last encoder hidden states of two different texts, concatenate them, and feed the result into the cross-attention module of the decoder layers as the keys and values. However, this seems hard to achieve with the existing generate function, which performs beam search. I would like to know how to modify the existing function to reach this goal, rather than implementing my own beam search from scratch.
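
To make the goal concrete, here is a minimal sketch of the operation described above. It assumes the Hugging Face transformers library with PyTorch (the question does not name a library), uses a tiny randomly initialised BART so that nothing has to be downloaded, and uses arbitrary token ids as stand-ins for the two texts. Passing precomputed `encoder_outputs` to `generate` is supported in the library versions I am aware of, but treat that as an assumption to check against your installed version.

```python
import torch
from transformers import BartConfig, BartForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

torch.manual_seed(0)

# Tiny random BART (illustrative sizes only) so the sketch runs offline.
config = BartConfig(
    vocab_size=100, d_model=16,
    encoder_layers=1, decoder_layers=1,
    encoder_attention_heads=2, decoder_attention_heads=2,
    encoder_ffn_dim=32, decoder_ffn_dim=32,
    max_position_embeddings=64,
)
model = BartForConditionalGeneration(config).eval()

# Arbitrary token ids standing in for the two tokenized texts.
ids_a = torch.tensor([[0, 10, 11, 12, 2]])
ids_b = torch.tensor([[0, 20, 21, 2]])
mask_a = torch.ones_like(ids_a)
mask_b = torch.ones_like(ids_b)

with torch.no_grad():
    encoder = model.get_encoder()
    # Encode each text separately.
    hid_a = encoder(input_ids=ids_a, attention_mask=mask_a).last_hidden_state
    hid_b = encoder(input_ids=ids_b, attention_mask=mask_b).last_hidden_state

    # Concatenate along the sequence dimension; the mask must match.
    joint_hidden = torch.cat([hid_a, hid_b], dim=1)
    joint_mask = torch.cat([mask_a, mask_b], dim=1)

    # Hand the concatenated states to generate() so the encoder is skipped
    # and the decoder cross-attends over both texts during beam search.
    generated = model.generate(
        encoder_outputs=BaseModelOutput(last_hidden_state=joint_hidden),
        attention_mask=joint_mask,
        num_beams=2,
        max_new_tokens=5,
    )

print(joint_hidden.shape, generated.shape)
```

With a real checkpoint, the random config would be replaced by `from_pretrained` and the hard-coded ids by tokenizer output. Note that a pretrained decoder has never seen two concatenated encodings, so output quality without fine-tuning is an open question.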