I have a set of 1024x1024 images and I am trying to fine-tune the segformer-b4-finetuned-cityscapes-1024-1024 pre-trained model for semantic segmentation.
The config for this model (linked above) says that the image size should be 224. If I set ignore_mismatched_sizes=True, I can pass it 1024x1024 images without any errors, and the results seem fairly strong.
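For reference, this is roughly how I'm loading the model; the two-class label mapping below is just a placeholder standing in for my actual dataset's labels:

>>> from transformers import SegformerForSemanticSegmentation
>>> # placeholder label mapping; my real dataset has its own classes
>>> id2label = {0: "background", 1: "foreground"}
>>> label2id = {v: k for k, v in id2label.items()}
>>> model = SegformerForSemanticSegmentation.from_pretrained(
...     "nvidia/segformer-b4-finetuned-cityscapes-1024-1024",
...     num_labels=len(id2label),
...     id2label=id2label,
...     label2id=label2id,
...     ignore_mismatched_sizes=True,  # decode head is re-initialized for my labels
... )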
I am wondering, though: what is happening behind the scenes? If the model expects a 224x224 image but receives a 1024x1024 image, how is that handled? Is the image downsampled before being fed to the model, or is it chunked into 224x224 patches?
Additionally, the feature extractor for this model appears to be configured to resize inputs to 512:
>>> feature_extractor = SegformerFeatureExtractor.from_pretrained("nvidia/segformer-b4-finetuned-cityscapes-1024-1024")
>>> feature_extractor
SegformerFeatureExtractor {
  "do_normalize": true,
  "do_resize": true,
  "feature_extractor_type": "SegformerFeatureExtractor",
  "image_mean": [
    0.485,
    0.456,
    0.406
  ],
  "image_std": [
    0.229,
    0.224,
    0.225
  ],
  "reduce_labels": false,
  "resample": 2,
  "size": 512
}
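If I want to keep the full 1024x1024 resolution, is overriding size at load time the intended fix? I have been assuming that kwargs passed to from_pretrained override the values in the config above, e.g.:

>>> feature_extractor = SegformerFeatureExtractor.from_pretrained(
...     "nvidia/segformer-b4-finetuned-cityscapes-1024-1024",
...     size=1024,  # override the default size of 512 shown above
... )
>>> feature_extractor.size
1024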