Hello,
In the image classification example, custom transforms are applied to the evaluation dataset: transformers/run_image_classification.py at 6cb19540c9a5288195445e8355646c06605a64fc · huggingface/transformers · GitHub
Why not just use the feature_extractor on the eval dataset?
Instead of transformers/run_image_classification.py at 6cb19540c9a5288195445e8355646c06605a64fc · huggingface/transformers · GitHub, we would have:
```python
def val_transforms(example_batch):
    """Apply the feature extractor across a batch."""
    # The feature extractor returns a BatchFeature; keep only the pixel values.
    example_batch["pixel_values"] = [
        feature_extractor(pil_img.convert("RGB"), return_tensors="pt")["pixel_values"][0]
        for pil_img in example_batch["image"]
    ]
    return example_batch
```
The two do not yield the same output. So if the model is later used with the feature_extractor as the preprocessing step, rather than the custom transforms it was evaluated with, we may get unexpected results.
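For intuition, here is a hypothetical sketch (not code from the example) of one likely source of the mismatch, assuming the eval transforms resize the shorter side and then center-crop, while the feature extractor resizes both sides to a fixed square. Plain size arithmetic already shows the two pipelines keep different parts of the source image:

```python
def resize_then_center_crop(w, h, size=224):
    # Torchvision-style eval transforms (assumed): scale so the *shorter*
    # side equals `size`, preserving aspect ratio, then center-crop.
    scale = size / min(w, h)
    rw, rh = round(w * scale), round(h * scale)
    left, top = (rw - size) // 2, (rh - size) // 2
    # Resized dims, plus the window (in resized coords) that survives the crop.
    return (rw, rh), (left, top, left + size, top + size)

def square_resize(w, h, size=224):
    # Feature-extractor-style resize (assumed): squash both sides to
    # `size` x `size`, distorting the aspect ratio but keeping every pixel.
    return (size, size), (0, 0, size, size)

# For a 500x375 image, the crop discards the left and right edges,
# while the square resize keeps them but distorts the image.
print(resize_then_center_crop(500, 375))  # ((299, 224), (37, 0, 261, 224))
print(square_resize(500, 375))            # ((224, 224), (0, 0, 224, 224))
```

Even when both pipelines also normalize identically, these two geometries produce different pixel tensors for any non-square input.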
Cheers!