Method 1 saves you some disk space, since it reads the image from the path you provided without copying the image to your disk.
Method 2 decodes your image completely and saves it on disk. The Array3D type makes it very fast to read the images, with zero-copy reads of the arrays from your disk. So you get high throughput at the cost of disk space.
Method 3 doesn’t give better performance than 1, but it requires some processing, and your map call copies the image into a new dataset, which consumes a bit of disk space.
Method 4 is the same as 2, but since it doesn’t have any intermediate step you save some disk space. Note that you didn’t use Array3D for this one, but you probably should, to get the best performance.
As mentioned above, feel free to use the imagefolder data loader. Combined with with_transform(transform_to_tensor), it gives you a good tradeoff between performance and disk space used.
the training loop is terribly slow (time spending on disk + CPU data loading)
To improve data loading speed, feel free to use a PyTorch DataLoader with num_workers > 1. This way the image decoding can be done in parallel in subprocesses.
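Something along these lines should work as a starting point. RandomImages is a hypothetical stand-in dataset that only simulates decoding, and the batch size and worker count are arbitrary values you would tune for your machine.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class RandomImages(Dataset):
    """Hypothetical stand-in for an image dataset."""

    def __init__(self, n=64, size=(3, 32, 32)):
        self.n = n
        self.size = size

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        # In a real dataset, this is where the image is read and decoded.
        gen = torch.Generator().manual_seed(idx)
        return {"pixel_values": torch.rand(self.size, generator=gen), "label": idx % 10}


def make_loader(dataset, batch_size=16, num_workers=2):
    # With num_workers > 0, __getitem__ (and so the decoding) runs in
    # worker subprocesses instead of the main training process.
    return DataLoader(dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers)


if __name__ == "__main__":
    loader = make_loader(RandomImages())
    batch = next(iter(loader))
    print(batch["pixel_values"].shape, batch["label"].shape)
```

A dataset from the datasets library that uses with_transform can be passed to DataLoader in the same way, since it also supports indexing and len().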