Data augmentation for image (ViT) using Hugging Face

lhoestq · October 27, 2021, 2:03pm

According to the set_transform documentation:

A formatting function is a callable that takes a batch (as a dict) as input and returns a batch.

So a single dictionary is always passed to your transform, and the values in that dictionary are lists of length batch_size. Can you check whether this is the case?

If it is, then you are good and you can process the examples by batch.
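To make the expected shape concrete, here is a minimal sketch of a batch-wise transform. The column names `image` and `label` and the `augment` helper are hypothetical stand-ins for illustration, not something from your code or from the library:

```python
def augment(image):
    # Hypothetical stand-in for a real augmentation (here: reverse the values).
    return image[::-1]


def batch_transform(batch):
    # `batch` is a single dict; each value is a list with one entry per example.
    n = len(batch["image"])
    print(f"transform called with {n} example(s)")
    # Augment every image in the batch and leave the other columns untouched.
    return {**batch, "image": [augment(img) for img in batch["image"]]}


# Simulate the dict that a batched access would hand to the transform.
out = batch_transform({"image": [[1, 2, 3], [4, 5, 6]], "label": [0, 1]})
```

Printing the number of examples inside the transform is also a quick way to check whether it receives full batches or lists of one element.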

But if you have only lists of 1 element, then the issue might come from the data loader.

Indeed, by default the PyTorch data loader fetches the items of a batch from the dataset one by one, like this:

batch = [dataset[idx] for idx in range(start, end)]

Therefore the augmentation function passed to set_transform is called batch_size times, each time with a single element. For the function to receive more than one item per call, the dataset should be indexed like this instead:

batch = dataset[start:end]
# or
batch = dataset[list_of_indices]
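The difference between the two access patterns can be seen with a toy example. `ToyDataset` below is a hypothetical pure-Python stand-in that only imitates how an on-the-fly transform is applied on access; it is not the actual datasets implementation:

```python
class ToyDataset:
    """Hypothetical stand-in that applies a batch transform on every access."""

    def __init__(self, values, transform):
        self.values = values
        self.transform = transform

    def __len__(self):
        return len(self.values)

    def __getitem__(self, key):
        if isinstance(key, int):
            # Single index: the transform still gets a batch, but of size 1.
            batch = self.transform({"x": [self.values[key]]})
            return {name: column[0] for name, column in batch.items()}
        if isinstance(key, slice):
            rows = self.values[key]
        else:
            # Any other key is treated as a list of indices.
            rows = [self.values[i] for i in key]
        return self.transform({"x": rows})


call_sizes = []


def record(batch):
    # Remember how many examples each transform call received.
    call_sizes.append(len(batch["x"]))
    return batch


ds = ToyDataset(list(range(8)), record)
per_item = [ds[i] for i in range(0, 4)]  # four calls, one example each
sliced = ds[0:4]                         # one call, four examples
indexed = ds[[5, 1, 7]]                  # one call, three examples
print(call_sizes)
```

The first pattern triggers one transform call per example, while slicing or passing a list of indices triggers a single call for the whole batch.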

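I think you can change the PyTorch data loading behavior to work this way if you pass a BatchSampler as the sampler of the DataLoader and set batch_size=None, so that the dataset is indexed with a whole list of indices at once.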
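Here is a rough sketch of that idea, assuming a toy map-style dataset (`IndexListDataset` is a hypothetical class, standing in for a dataset with a transform set). With `batch_size=None`, automatic batching is disabled and each list of indices produced by the `BatchSampler` is passed to `__getitem__` as is:

```python
import torch
from torch.utils.data import BatchSampler, DataLoader, Dataset, SequentialSampler


class IndexListDataset(Dataset):
    """Hypothetical toy dataset that expects a list of indices per access."""

    def __init__(self, n):
        self.data = list(range(n))
        self.call_sizes = []

    def __len__(self):
        return len(self.data)

    def __getitem__(self, indices):
        # `indices` is a list here because the sampler yields whole batches.
        self.call_sizes.append(len(indices))
        return {"x": torch.tensor([self.data[i] for i in indices])}


dataset = IndexListDataset(10)

# The BatchSampler groups indices into lists of up to 4.
sampler = BatchSampler(SequentialSampler(dataset), batch_size=4, drop_last=False)

# batch_size=None turns off automatic batching in the DataLoader itself.
loader = DataLoader(dataset, sampler=sampler, batch_size=None)

batches = [batch["x"].tolist() for batch in loader]
print(batches)
print(dataset.call_sizes)
```

For shuffled training batches, `RandomSampler` could be swapped in for `SequentialSampler`. I haven't tried this with your exact setup, so treat it as a starting point.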

Let me know if that helps!
