I'm a little confused about where to apply transformations
I have a Hugging Face dataset of images that I pass to a PyTorch DataLoader. I’m already applying transformations in the Hugging Face dataset to resize the images and normalize the data; these overwrite the existing dataset values, and that is working fine for me. However, I want to add some more transformations that, rather than replacing the images, create slightly varied versions of them, for example to simulate night time or a brighter day. Is the best place to do this the Hugging Face dataset (on the fly) or PyTorch's DataLoader (on the fly)? Some things are unclear to me:
- If I create an extra copy of each image with the transformations applied, so that the Hugging Face dataset holds both the original and the transformed images, then when looping through a DataLoader with a batch size of, say, 100, would each batch contain 100 or 200 images?
- In what circumstances would you apply transformations at the Hugging Face dataset level?
- In what circumstances would you apply transformations at the PyTorch DataLoader (dataset) level?
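
To make the first question concrete, here is a minimal sketch of the two setups I'm comparing. It uses plain PyTorch with fake tensors instead of my real Hugging Face dataset, and the `brighten` helper and both class names (`StoredCopies`, `OnTheFly`) are made up for illustration, so treat it as an assumption about how this behaves rather than my actual pipeline.

```python
import torch
from torch.utils.data import DataLoader, Dataset


def brighten(img: torch.Tensor, factor: float) -> torch.Tensor:
    # Hypothetical stand-in for a brightness augmentation:
    # scale pixel values and clamp them back into [0, 1].
    return (img * factor).clamp(0.0, 1.0)


class StoredCopies(Dataset):
    """Option A: store the originals plus one augmented copy of each image."""

    def __init__(self, images: torch.Tensor):
        self.images = torch.cat([images, brighten(images, 1.5)])

    def __len__(self) -> int:
        return len(self.images)

    def __getitem__(self, idx: int) -> torch.Tensor:
        return self.images[idx]


class OnTheFly(Dataset):
    """Option B: apply a random augmentation each time an image is fetched."""

    def __init__(self, images: torch.Tensor):
        self.images = images

    def __len__(self) -> int:
        return len(self.images)

    def __getitem__(self, idx: int) -> torch.Tensor:
        factor = torch.empty(1).uniform_(0.5, 1.5).item()
        return brighten(self.images[idx], factor)


# Fake images in [0, 1], standing in for my already resized images.
images = torch.rand(400, 3, 8, 8)

loader_a = DataLoader(StoredCopies(images), batch_size=100)
loader_b = DataLoader(OnTheFly(images), batch_size=100)

batch_a = next(iter(loader_a))
batch_b = next(iter(loader_b))

# Compare the size of one batch and the number of batches per epoch.
print(batch_a.shape, len(loader_a))
print(batch_b.shape, len(loader_b))
```

In this plain-PyTorch version the size of each batch is fixed by `batch_size`, so the difference between the two options should be the number of batches per epoch rather than the number of images per batch. What I'd like to confirm is whether the same holds when the dataset is a Hugging Face dataset, and which of the two options is the usual choice.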