Image classification: Why use both a transform and a processor to preprocess images?

Hi @Steyn-vanLeeuwen, I'm quoting from the tutorial:

> You might wonder why we pass along the image_processor as a tokenizer when we already preprocessed our data. This is only to make sure the image processor configuration file (stored as JSON) will also be uploaded to the repo on the hub.

I don't think you're doing two different preprocessing steps. You're just reading a few values from `image_processor` and passing them to two other functions:

```python
from torchvision.transforms import Normalize

normalize = Normalize(mean=image_processor.image_mean, std=image_processor.image_std)
if "height" in image_processor.size:
    size = (image_processor.size["height"], image_processor.size["width"])
    crop_size = size
    max_size = None
elif "shortest_edge" in image_processor.size:
    size = image_processor.size["shortest_edge"]
    crop_size = (size, size)
    max_size = image_processor.size.get("longest_edge")
```
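If it helps to see both branches in isolation, here's a minimal, self-contained sketch of that size-selection logic, with plain dicts standing in for `image_processor.size` (the example values are hypothetical, not taken from any specific checkpoint):

```python
def resolve_sizes(size_cfg):
    """Mimic how the transform picks size/crop_size from image_processor.size."""
    if "height" in size_cfg:
        # Processors with a fixed output resolution (e.g. ViT-style configs).
        size = (size_cfg["height"], size_cfg["width"])
        crop_size = size
        max_size = None
    elif "shortest_edge" in size_cfg:
        # Processors that resize by shortest edge, optionally capped by longest_edge.
        size = size_cfg["shortest_edge"]
        crop_size = (size, size)
        max_size = size_cfg.get("longest_edge")
    return size, crop_size, max_size

print(resolve_sizes({"height": 224, "width": 224}))  # ((224, 224), (224, 224), None)
print(resolve_sizes({"shortest_edge": 256}))         # (256, (256, 256), None)
```

Either way, the transform ends up normalizing and cropping with exactly the numbers the image processor would use, so the two stay consistent.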

For inference you can apply just the `image_processor`, as explained in the tutorial.
