I would like to take the ViT model for classification and adapt it to a regression task. Is that feasible?
Can the model work just by changing the loss function? How can I define the classes in the _info method of my custom dataset, since there are infinitely many possible values? What other changes do I need to make?
If you set num_labels in the config to 1, the model will automatically use the MSE loss for regression, as can be seen here. So yes, it's totally possible.
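To make this concrete, here is a minimal sketch of the num_labels=1 case. It assumes a reasonably recent version of transformers and PyTorch, and it uses a tiny, randomly initialised config (the sizes are made up for illustration) so that nothing needs to be downloaded. With real data you would load a pretrained checkpoint instead.

```python
import torch
from transformers import ViTConfig, ViTForImageClassification

# Tiny illustrative config (assumed sizes, random weights, no download).
config = ViTConfig(
    image_size=32,
    patch_size=8,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=1,  # a single output unit, treated as regression
)
model = ViTForImageClassification(config).eval()

pixel_values = torch.randn(4, 3, 32, 32)  # (batch_size, channels, height, width)
labels = torch.randn(4)                   # one float target per image

with torch.no_grad():
    outputs = model(pixel_values=pixel_values, labels=labels)

print(outputs.logits.shape)  # one value per image
print(outputs.loss)          # scalar loss computed from the float targets
```

Note that the labels are plain floats here, so there are no classes to enumerate; in a custom dataset the label would typically be declared as a float value rather than a class label.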
What if I'm trying to predict (x, y) coordinates? I'm having trouble getting outputs of shape (batch_size, 2), which is the shape of my labels. Where can I learn more about the outputs from config = ViTConfig.from_pretrained('google/vit-base-patch16-224', num_labels=) ?
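One possible approach for two-dimensional targets, sketched under assumptions: in recent versions of transformers the config accepts a problem_type field, and setting it to "regression" together with num_labels=2 should give logits of shape (batch_size, 2) and an MSE loss against float labels of the same shape. Without problem_type, float labels with num_labels greater than 1 may be treated as multi-label classification instead. The tiny config below is illustrative only and uses random weights.

```python
import torch
from transformers import ViTConfig, ViTForImageClassification

# Tiny illustrative config (assumed sizes, random weights, no download).
config = ViTConfig(
    image_size=32,
    patch_size=8,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=2,               # two outputs: x and y
    problem_type="regression",  # ask for a regression loss explicitly
)
model = ViTForImageClassification(config).eval()

pixel_values = torch.randn(4, 3, 32, 32)  # (batch_size, channels, height, width)
labels = torch.rand(4, 2)                 # float (x, y) targets, shape (batch_size, 2)

with torch.no_grad():
    outputs = model(pixel_values=pixel_values, labels=labels)

print(outputs.logits.shape)  # matches the shape of the labels
print(outputs.loss)          # scalar loss over both coordinates
```

When starting from the google/vit-base-patch16-224 checkpoint, the pretrained classification head has a different size, so loading it with num_labels=2 will likely also need ignore_mismatched_sizes=True passed to from_pretrained, which reinitialises the head.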