Plotting attention maps with the Seq2Seq Trainer

I’ve trained a ViT+GPT2 image caption generation model. Now I want to plot the attention maps to see which parts of the image the model attends to while it generates each caption token. However, there doesn't seem to be a way to get the attention weights through the Seq2Seq Trainer API. Is there one?
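
To make the goal concrete, here is a minimal sketch of the plotting step I have in mind. It assumes the cross-attention weights for one generated token are already available as an array of shape `(num_heads, num_encoder_tokens)`, for example by calling the model directly rather than going through the trainer (something like `model.generate(..., output_attentions=True, return_dict_in_generate=True)` and reading `cross_attentions`; I have not confirmed this works for my setup). The helper name `attention_to_patch_grid` is made up for illustration, and it assumes the ViT encoder prepends one CLS token and that the remaining patches form a square grid.

```python
import math

import numpy as np


def attention_to_patch_grid(attn, has_cls_token=True):
    """Convert one token's cross-attention into a square patch heatmap.

    attn: array-like of shape (num_heads, num_encoder_tokens).
    Returns a (side, side) array scaled so that its maximum is 1.
    Hypothetical helper, not part of any library.
    """
    attn = np.asarray(attn, dtype=np.float64)
    if attn.ndim != 2:
        raise ValueError("expected shape (num_heads, num_encoder_tokens)")

    # Average over attention heads to get one weight per encoder token.
    per_token = attn.mean(axis=0)

    # Assumption: the first encoder token is CLS and is not an image patch.
    if has_cls_token:
        per_token = per_token[1:]

    # Assumption: the patches tile the image as a square grid.
    side = math.isqrt(per_token.size)
    if side * side != per_token.size:
        raise ValueError("number of patches is not a perfect square")

    grid = per_token.reshape(side, side)
    peak = grid.max()
    return grid / peak if peak > 0 else grid


if __name__ == "__main__":
    # Dummy weights standing in for real model output:
    # 12 heads, 196 patches + 1 CLS token.
    dummy = np.random.default_rng(0).random((12, 197))
    heatmap = attention_to_patch_grid(dummy)
    print(heatmap.shape)  # (14, 14)
```

The resulting grid could then be upsampled to the image size and overlaid with something like matplotlib's `imshow(..., alpha=0.5)`. What I'm missing is the supported way to get the attention weights out of the trained model in the first place.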