Inference with ViTMAE by providing a mask

Hi,

I am trying to use ViTMAE (https://huggingface.co/docs/transformers/model_doc/vit_mae). More specifically, I have an image and a mask that specifies the parts of the image I'd like to reconstruct.

As I understand the paper, the model is designed for exactly this task, but looking into the code and demos I always find that the mask is generated randomly inside the forward method of the MAE model.

  1. Is my understanding correct, or am I missing something essential?
  2. Is there a way to achieve my goal without changing too much of the original code? (A sketch of what I have in mind is below.)
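
For reference, here is a minimal sketch of what I imagine this could look like. If I'm reading modeling_vit_mae.py correctly, ViTMAEForPreTraining.forward() accepts an optional noise tensor of shape (batch_size, num_patches), and random_masking() argsorts it ascending and keeps the patches with the lowest values, so giving the patches I want reconstructed strictly larger noise should force exactly those patches out. The checkpoint is the standard facebook/vit-mae-base; the image path and the example patch indices are just placeholders:

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, ViTMAEForPreTraining

# Placeholder: a binary patch-level mask, 1 = patch I want reconstructed.
# facebook/vit-mae-base takes 224x224 images in 16x16 patches -> 196 patches.
num_patches = 196
patch_mask = torch.zeros(num_patches)
patch_mask[50:80] = 1.0  # arbitrary example region

# The number of masked patches is fixed by config.mask_ratio via
# len_keep = int(num_patches * (1 - mask_ratio)), so set it to match my mask.
# Subtracting half a patch guards against float rounding in that int().
mask_ratio = (patch_mask.sum().item() - 0.5) / num_patches

model = ViTMAEForPreTraining.from_pretrained(
    "facebook/vit-mae-base", mask_ratio=mask_ratio
)
processor = AutoImageProcessor.from_pretrained("facebook/vit-mae-base")

image = Image.open("my_image.png").convert("RGB")  # placeholder path
inputs = processor(images=image, return_tensors="pt")

# random_masking keeps the patches with the smallest noise values, so the
# patches to reconstruct get noise in [1, 1.01) and the rest in [0, 0.01).
noise = patch_mask.unsqueeze(0) + 0.01 * torch.rand(1, num_patches)

with torch.no_grad():
    outputs = model(**inputs, noise=noise)

print(outputs.mask)          # hopefully equals patch_mask (1 = masked)
print(outputs.logits.shape)  # (1, 196, 16*16*3) per-patch pixel predictions
```

I haven't verified that this reproduces my mask exactly, so if passing noise like this isn't the intended mechanism, I'd be glad to hear about a cleaner way.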

Thanks for your help!