ConvNextImageProcessor: unexpected resize behaviour when shortest_edge is 224

Hi!

I am working on an image classification task and ran into an issue where the results of trainer.predict() and pipeline(…) differed hugely. I traced the problem to the image processor.

I am using ConvNeXT V2, which uses the ConvNextImageProcessor. My original input images are 1200x1920px. Since only the center of the image contains relevant information, I manually cropped to 1100x600px, resized to 224x224, and used that as input for training and validation. The results there look good.

I am using a 224 variant (facebook/convnextv2-tiny-22k-224) and found that the ConvNextImageProcessor behaves differently for an input size of 224 vs. 384.

For shortest_edge=384 the images are simply resized as they are, which is what I would expect. But for shortest_edge=224 there is more going on: the image is first resized depending on the crop_pct factor, and then a 224x224 square is cropped out and used.
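To make the two branches concrete, here is a minimal sketch of the size logic as I understand it (simplified; assuming the default crop_pct of 224/256 = 0.875, and my rounding may differ slightly from the library's internal helper):

```python
def convnext_resize_plan(height, width, shortest_edge=224, crop_pct=0.875):
    """Sketch of ConvNextImageProcessor's resize behaviour (assumption,
    not the library's exact code). Returns the intermediate resize target
    and the final center-crop size, as (height, width) tuples."""
    if shortest_edge >= 384:
        # large sizes: plain resize to a square, no cropping
        return {"resize_to": (shortest_edge, shortest_edge), "crop_to": None}
    # small sizes: resize so the shortest edge becomes shortest_edge / crop_pct
    # while preserving aspect ratio, then center-crop a square out of it
    resize_shortest = int(shortest_edge / crop_pct)
    scale = resize_shortest / min(height, width)
    resize_to = (round(height * scale), round(width * scale))
    return {"resize_to": resize_to, "crop_to": (shortest_edge, shortest_edge)}

# My 1100x600 crop with shortest_edge=224: resized to 256x469,
# then only a 224x224 window survives -- under half the width is kept.
print(convnext_resize_plan(600, 1100, 224))
# With shortest_edge=384 the same image would just be squashed to 384x384.
print(convnext_resize_plan(600, 1100, 384))
```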

In my case I am losing relevant information and the score drops massively. Why does the ConvNextImageProcessor behave differently depending on shortest_edge?

Also, for training I simply resized the images to 224x224. At inference the picture looks completely different, because the ConvNextImageProcessor resizes with a fixed aspect ratio and then crops.

What is the right approach to handle this? Should I adapt the preprocessing of my training images to match the ConvNextImageProcessor's behaviour? But how am I supposed to know exactly what each of the ImageProcessors does?
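One workaround I am considering (a sketch, not sure it is the intended approach): keep my own crop + 224x224 resize exactly as in training, and construct the processor with do_resize=False so it only rescales and normalizes. The crop coordinates and the zero-filled stand-in frame below are purely illustrative:

```python
import numpy as np
from PIL import Image
from transformers import ConvNextImageProcessor

# Assumption: with do_resize=False the processor skips the crop_pct
# resize + center crop entirely and only rescales/normalizes.
processor = ConvNextImageProcessor(do_resize=False)

# Stand-in for a real 1920x1200 frame (PIL sizes are width x height).
frame = Image.fromarray(np.zeros((1200, 1920, 3), dtype=np.uint8))

# Illustrative coordinates for the 1100x600 region of interest.
roi = frame.crop((410, 300, 1510, 900))   # -> 1100x600
roi = roi.resize((224, 224))              # same resize as in training
inputs = processor(roi)                   # no further resize/crop here
print(inputs["pixel_values"][0].shape)    # (3, 224, 224)
```

If that works, I could presumably also pass this processor to the pipeline via its image_processor argument so that predict and pipeline see identical pixels.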

Or should I just use a model that uses shortest_edge=384?

Hope you can help.