Image dataset with_transform not applied

Hi,

I am training a computer vision model and want to use AutoImageProcessor to prepare images for the model.

When I use with_transform and then use a Trainer, the transformation is not applied.

The dataset contains PIL images.

from transformers import AutoImageProcessor
image_processor = AutoImageProcessor.from_pretrained('google/vit-base-patch16-224-in21k', use_fast=True)

def transform(example):
    ds = {}
    ds['image'] = image_processor(example, return_tensors='pt')['pixel_values'].reshape(3,224,224)
    return ds

dataset = dataset.with_transform(transform)

When I use .map it works, but I have a large dataset and map is taking too much space.

Any ideas why with_transform is not called at all? I have also tried a DataLoader, and the transformation is not applied there either.

Hi @dimitrije-it, with_transform works fine for the following snippet:

import datasets
print("Datasets version:", datasets.__version__)

from transformers import AutoImageProcessor
from datasets import load_dataset

image_processor = AutoImageProcessor.from_pretrained('google/vit-base-patch16-224-in21k', use_fast=True)
dataset = load_dataset("zh-plus/tiny-imagenet")["train"]

def transform(example):
    inputs = image_processor(example["image"], return_tensors='pt')['pixel_values'].reshape(3,224,224)
    return {
        "image": inputs,
        "label": example["label"]
    }

dataset = dataset.with_transform(transform)
print(dataset[0])

Output:

Datasets version: 2.20.0
{'image': tensor([[ 1.0000,  1.0000,  1.0000,  ...,  0.1373, -0.0039, -0.0039],
        [ 1.0000,  1.0000,  1.0000,  ...,  0.1373, -0.0039, -0.0039],
        [ 1.0000,  1.0000,  1.0000,  ...,  0.1294, -0.0118, -0.0118],
        ...,
        [-0.2863, -0.2863, -0.2863,  ..., -0.4039, -0.4118, -0.4118],
        [-0.2863, -0.2863, -0.2863,  ..., -0.4275, -0.4353, -0.4353],
        [-0.2863, -0.2863, -0.2863,  ..., -0.4275, -0.4353, -0.4353]]), 'label': 0}
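
One thing to keep in mind with the snippet above: with_transform calls the function lazily on access and passes it a batch (a dict of lists), even for dataset[0]. The reshape to (3,224,224) therefore only works when the batch holds a single image. A minimal batch-aware sketch follows, assuming the model expects a "pixel_values" input (as ViT models do). The helper name make_batch_transform is hypothetical, not part of datasets or transformers.

```python
def make_batch_transform(processor):
    """Build a function suitable for Dataset.with_transform.

    `make_batch_transform` is a hypothetical helper name used for
    illustration; `processor` is any callable with the same call shape as
    the image processor in the thread.
    """
    def transform(batch):
        # with_transform passes a dict of lists, so batch["image"] is a
        # list of PIL images, even when a single row is requested.
        pixel_values = processor(batch["image"], return_tensors="pt")["pixel_values"]
        # Keep the leading batch dimension (batch, channels, height, width)
        # instead of reshaping to a single image.
        return {"pixel_values": pixel_values, "label": batch["label"]}
    return transform

# Usage sketch (assumes `image_processor` and `dataset` from the snippet above):
# dataset = dataset.with_transform(make_batch_transform(image_processor))
# sample = dataset[0]
```

As for the Trainer case, one possible cause (not confirmed in this thread) is that Trainer drops dataset columns that the model's forward method does not accept when remove_unused_columns is left at its default of True, so the "image" column may be gone before the transform runs. Passing remove_unused_columns=False in TrainingArguments is worth trying.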