Hey everyone, I’ve been scouring the internet for the last few days trying to find an answer to the following question, but it’s still a bit unclear to me. When applying data augmentation using map or set_transform, I’ve noticed that the size of the training set does not increase, so I got confused as to what’s actually happening. In my mind, if it’s not adding additional data, then it doesn’t make sense. I think I may have an understanding now, though, and I would appreciate it if someone could confirm or correct my current understanding, which is as follows:

- Data is loaded into a Dataset and split into train, dev and test sets → DatasetDict.
- The set_transform method is applied to the dataset with whichever function has been passed as a parameter.
- The training data remains unchanged until the model is trained.
- At each epoch, the transformations are applied to the input data, so the amount of training data stays constant, but variation is added through the transformations.
- So the benefit does not come from a larger training set; it comes from training for more epochs while the same set of several different transformations is applied each time, which should lead to better inference.
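
To make the steps above concrete, here is a minimal sketch of how I picture it. It does not use the real datasets library: LazyTransformDataset and random_jitter are hypothetical names I made up, the "images" are single floats, and the transform receives one row at a time rather than a batch. The only thing it is meant to show is that set_transform stores a function that runs when a row is read, so the length and the stored data stay the same while each read can return a different variant.

```python
import random

_rng = random.Random(0)  # fixed seed so the sketch is reproducible


class LazyTransformDataset:
    """Toy stand-in (hypothetical, NOT the real datasets.Dataset)."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.transform = None

    def set_transform(self, fn):
        # Only remembers the function; nothing is computed or copied here.
        self.transform = fn

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        row = self.rows[idx]
        # The transform runs at access time, on a fresh copy of the row.
        return self.transform(dict(row)) if self.transform else row


def random_jitter(row):
    # Hypothetical augmentation: add a little random noise to the "pixel".
    row["pixel"] = row["pixel"] + _rng.uniform(-0.1, 0.1)
    return row


train = LazyTransformDataset({"pixel": float(i), "label": i % 2} for i in range(4))
train.set_transform(random_jitter)

size_after_transform = len(train)      # number of rows is unchanged
first_read = train[0]["pixel"]         # augmented on access
second_read = train[0]["pixel"]        # same index, new random draw
stored_value = train.rows[0]["pixel"]  # underlying data is untouched
```

If this picture is right, reading the same index twice gives two different augmented values, while len(train) and the stored row never change.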

So in essence, the training data doesn’t actually get “augmented” in the sense of growing in size. Instead, there is a multiplier effect: each epoch sees the same number of samples, but the random transformations produce different variants of them, so more epochs expose the model to more variation, provided the number of epochs increases until performance peaks.
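
Here is a small sketch of the multiplier effect I have in mind, again with made-up pieces: base_images is a placeholder list of three names, and augment is a hypothetical function that just picks a random flip and rotation instead of touching real pixels. It counts how many samples each epoch visits versus how many distinct variants have been seen overall.

```python
import random

rng = random.Random(42)  # fixed seed for reproducibility

base_images = ["img_a", "img_b", "img_c"]  # placeholder training set (N = 3)
num_epochs = 5


def augment(name):
    # Hypothetical augmentation: choose a random flip and rotation angle.
    flip = rng.choice([False, True])
    angle = rng.randrange(0, 360)
    return (name, flip, angle)


seen_variants = set()
for epoch in range(num_epochs):
    for name in base_images:  # every epoch still visits exactly N samples
        seen_variants.add(augment(name))

samples_per_epoch = len(base_images)                 # stays at N
total_samples_seen = samples_per_epoch * num_epochs  # N * epochs
distinct_variants = len(seen_variants)               # between N and N * epochs
```

The dataset size stays at N throughout, but the number of distinct variants the model has been shown can grow with the number of epochs, up to N times the number of epochs.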

Is this the correct understanding of how data augmentation works for a ViT model using DatasetDict and set_transform?

Thank you!