I am looking at two tutorials for image classification using ViTs, one from Hugging Face and one from AI Summer.
However, the first one, from Hugging Face, uses trainer.evaluate() to output the metrics, while the AI Summer one uses trainer.predict(). Is there any substantial difference between the two methods, or are they interchangeable?
FYI, the models I am using are ‘google/vit-base-patch16-224-in21k’, ‘microsoft/cvt-13’, ‘microsoft/resnet-50’, and ‘facebook/convnext-base-224-22k’.