I've trained a ViT+GPT-2 image captioning model. Now I want to plot attention maps to see what parts of the image the model is attending to. But it seems there's no way to do this through the Seq2Seq trainer API, or is there?
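One approach that doesn't go through the trainer at all: call `model.generate` yourself with `output_attentions=True` and `return_dict_in_generate=True`, which for encoder-decoder models returns `cross_attentions` (per generated token, per decoder layer, shaped `(batch, heads, 1, encoder_len)`). Those tensors can then be turned into a heat map over the image. Below is a minimal sketch of that second step, assuming a standard ViT encoder (224px input, 16px patches, so a 14x14 patch grid plus a CLS token); `attention_to_map` is a hypothetical helper name, not a library function:

```python
import numpy as np

def attention_to_map(cross_attn, grid=14, image_size=224):
    """Turn one decoder token's cross-attention (num_heads, 1 + grid*grid)
    into a 2D heat map over the input image.

    cross_attn: attention weights from a single generated token to the ViT
    encoder outputs, with the CLS token in position 0.
    """
    # Average over attention heads.
    attn = cross_attn.mean(axis=0)
    # Drop the CLS token; keep the grid*grid patch tokens.
    attn = attn[1:]
    # Reshape the flat patch sequence back into the ViT patch grid.
    attn = attn.reshape(grid, grid)
    # Nearest-neighbour upsample to image resolution for overlaying.
    scale = image_size // grid
    return np.kron(attn, np.ones((scale, scale)))
```

In practice you would feed it something like `out.cross_attentions[step][layer][0, :, 0, :]` from `out = model.generate(pixel_values, output_attentions=True, return_dict_in_generate=True)`, then overlay the returned map on the image with `plt.imshow(heat_map, alpha=0.5)`.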