I think the title says it all. I’m trying to highlight the attention between my image and the text generated by the model. However, since I don’t fully grasp the concept of attention and the model is complicated by nature, I don’t understand what I need to extract to visualize the cross-attention. T…

How can one visualize the Cross-Attention of a VisionEncoderDecoderModel?

Th3Wh1t3Q November 7, 2023, 7:21am 3

I tried to overlay the attention maps on the image as a mask after rescaling them, but it yielded no visible results, and I haven’t tried anything further for now.
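For anyone trying the same thing, here is a minimal sketch of the rescale-and-overlay step using only NumPy. It is not tied to any particular checkpoint, and the names `attention_to_heatmap` and `overlay` are hypothetical helpers, not library functions. It assumes the cross-attention for one generated token has shape `(num_heads, src_len)`, that the encoder is ViT-style with a leading CLS token followed by a row-major grid of patches, and that the image is a float array in `[0, 1]`. One possible reason an overlay looks blank is that raw softmax weights are very small, so the sketch min-max normalizes the map before blending.

```python
import numpy as np

def attention_to_heatmap(attn, image_hw, grid_hw, has_cls_token=True):
    """Turn cross-attention for ONE generated token into an image-sized map.

    attn:      array of shape (num_heads, src_len), assumed layout.
    image_hw:  (height, width) of the image to overlay on.
    grid_hw:   (rows, cols) of the encoder patch grid, e.g. (14, 14).
    """
    attn = np.asarray(attn, dtype=np.float64)
    per_patch = attn.mean(axis=0)        # average over attention heads
    if has_cls_token:
        per_patch = per_patch[1:]        # drop the CLS position (assumption)
    gh, gw = grid_hw
    if per_patch.size != gh * gw:
        raise ValueError(
            f"expected {gh * gw} patch positions, got {per_patch.size}"
        )
    grid = per_patch.reshape(gh, gw)

    # Nearest-neighbour upsampling: map every pixel to its patch cell.
    height, width = image_hw
    rows = np.arange(height) * gh // height
    cols = np.arange(width) * gw // width
    heat = grid[np.ix_(rows, cols)]

    # Min-max normalise so the overlay is visible even for tiny weights.
    lo, hi = heat.min(), heat.max()
    if hi > lo:
        return (heat - lo) / (hi - lo)
    return np.zeros_like(heat)

def overlay(image, heat, alpha=0.6):
    """Blend a (H, W) heatmap onto a (H, W, 3) float image in [0, 1]."""
    image = np.asarray(image, dtype=np.float64)
    return (1.0 - alpha) * image + alpha * heat[..., None]
```

With Transformers, the weights can usually be obtained by calling `generate` with `output_attentions=True` and `return_dict_in_generate=True` and reading `cross_attentions` from the result; as far as I know that is a tuple over generated tokens, each a tuple over decoder layers, so you still need to pick a layer, take the last query position, and index the batch before passing the array in. Encoders without a CLS token or with a downsampled grid (Swin-style, for example) need `has_cls_token=False` and a different `grid_hw`.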
