I would like to train a transformer on images 512x512. I was looking at the huggingpics notebook as a starting point but failed to modify it to train on images with size different from 224x224.
I tried adding the “size” to ViTFeatureExtractor.from_pretrained (unknown keyword) and to ViTForImageClassification.from_pretrained (passed, but when it reached the trainer.fit fails with “ValueError: Input image size (512512) doesn’t match model (224224).”).
While i know how to do that in plain vanilla keras and pytorch, the hugginingface is just too user friendly for me!
Any help will be greatly appreciated.