Vision Transformer reconstruct image

marcomameli01 · May 22, 2022, 5:30pm

Dear,
how can I reconstruct, from the output of a ViT, an image of features like the output of a CNN?

I need this kind of representation because another network uses the information it carries.

marcomameli01 · July 21, 2022, 7:23am

Hello,
I need to obtain an output like the feature images shown in the video of that image, so that I can use it as input for another part of the network.

nielsr · July 21, 2022, 11:19am

Hi,
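
One common way to get a CNN-style feature map from a ViT is to drop the [CLS] token and reshape the remaining patch tokens back onto the patch grid. The snippet below is only a minimal sketch of that idea, not necessarily the approach discussed in this thread. It assumes the token layout of `ViTModel`'s `last_hidden_state` in 🤗 Transformers, i.e. `(batch, 1 + num_patches, hidden_size)` with [CLS] first and patches in row-major order. The helper name `tokens_to_feature_map` is hypothetical, and the random array is just a stand-in for a real model output (a 224x224 input with 16x16 patches gives a 14x14 grid).

```python
import numpy as np


def tokens_to_feature_map(tokens, grid_h, grid_w, has_cls_token=True):
    """Reshape ViT tokens (batch, seq_len, hidden) into (batch, hidden, grid_h, grid_w)."""
    if has_cls_token:
        # Assumption: the first token is [CLS] and carries no spatial position.
        tokens = tokens[:, 1:, :]
    batch, num_patches, hidden = tokens.shape
    if num_patches != grid_h * grid_w:
        raise ValueError(
            f"expected {grid_h * grid_w} patch tokens, got {num_patches}"
        )
    # Assumption: patches are ordered row by row (left to right, top to bottom).
    fmap = tokens.reshape(batch, grid_h, grid_w, hidden)
    # Move channels first, as a CNN feature map would be laid out.
    return fmap.transpose(0, 3, 1, 2)


# Stand-in for a real `last_hidden_state` (random values, for illustration only).
rng = np.random.default_rng(0)
dummy_tokens = rng.standard_normal((2, 1 + 14 * 14, 768))

feature_map = tokens_to_feature_map(dummy_tokens, grid_h=14, grid_w=14)
print(feature_map.shape)  # (2, 768, 14, 14)
```

The resulting array has the same `(batch, channels, height, width)` layout as a CNN feature map, so it can be passed to convolutional layers in another part of the network. With PyTorch tensors the same steps would be a `reshape` followed by `permute(0, 3, 1, 2)`.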
