How do I use generation_utils.generate?

I am working on an image captioning task. There are various implementations in different communities, each with its own improvements, but I want to write code that uses the standard Transformer architecture, so I based mine on the code in The Annotated Transformer.
Now I need to implement beam search for sequence generation. I noticed that the transformers library has an implementation in generation_utils.generate, and there is a lot of documentation about text generation, but it all uses pre-trained models and calls model.generate directly. How should I use generation_utils.generate to generate captions on my own dataset with a model based on the standard Transformer architecture? Are there any examples or tutorials I can refer to?
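
For context, the kind of beam search being asked about can be sketched roughly as below. This is a minimal, library-independent sketch and not the transformers implementation: `beam_search`, `step_fn`, and `toy_step` are hypothetical names, and `step_fn` is assumed to wrap one decoder forward pass (for captioning, conditioned on the encoded image) and return log-probabilities over the vocabulary for the next token.

```python
import math
from typing import Callable, List, Sequence, Tuple


def beam_search(
    step_fn: Callable[[Sequence[int]], Sequence[float]],  # hypothetical: prefix -> next-token log-probs
    bos_id: int,
    eos_id: int,
    beam_size: int = 3,
    max_len: int = 20,
) -> List[Tuple[List[int], float]]:
    """Return hypotheses as (tokens, summed log-prob), best first."""
    beams = [([bos_id], 0.0)]  # live (unfinished) hypotheses
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every vocabulary id.
        candidates = []
        for tokens, score in beams:
            for tok, lp in enumerate(step_fn(tokens)):
                candidates.append((tokens + [tok], score + lp))
        # Keep the beam_size best expansions; those ending in EOS are done.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == eos_id:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished


def toy_step(prefix: Sequence[int]) -> List[float]:
    # Hypothetical stand-in for a real decoder over a 4-token vocabulary
    # (0 = BOS, 1 = EOS): prefers token 2 early on, then prefers EOS.
    probs = [0.01, 0.09, 0.6, 0.3] if len(prefix) < 3 else [0.01, 0.9, 0.05, 0.04]
    return [math.log(p) for p in probs]


best_tokens, best_score = beam_search(toy_step, bos_id=0, eos_id=1)[0]
```

The sketch scores hypotheses by summed log-probability without length normalization and scores one prefix at a time; a real implementation would typically batch the beams through the decoder and may add a length penalty.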
Thanks a lot.