/downloads contains the downloaded data files, and /imagenet-1k contains an .arrow file generated from them. The images are already JPEG-compressed, so the conversion from TAR to Arrow can hardly compress them any further. Hence, the total size is about twice the original dataset's size.
Deleting /downloads should work.
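If you want to check how much space this frees, here is a minimal sketch using only the standard library. The helper names (dir_size_bytes, remove_downloads) are made up for illustration, and the cache root is an assumption: ~/.cache/huggingface/datasets is the usual default, but yours may differ if HF_DATASETS_CACHE is set.

```python
import shutil
from pathlib import Path


def dir_size_bytes(path: Path) -> int:
    """Total size of all regular files under `path` (0 if it does not exist)."""
    if not path.exists():
        return 0
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def remove_downloads(cache_root: Path) -> int:
    """Delete <cache_root>/downloads and return the number of bytes freed.

    Hypothetical helper: only the downloads folder is removed, the
    generated .arrow files elsewhere under cache_root are left alone.
    """
    downloads = cache_root / "downloads"
    freed = dir_size_bytes(downloads)
    shutil.rmtree(downloads, ignore_errors=True)
    return freed
```

For example, remove_downloads(Path.home() / ".cache" / "huggingface" / "datasets") would apply it to the assumed default cache location. Note that the deletion is irreversible, so the archives would have to be downloaded again if the .arrow file ever needs to be regenerated.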
PS: Calling ds.cleanup_cache_files() deletes all of the dataset's cached .arrow files except the ones listed in ds.cache_files (the ones that are memory-mapped).
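A rough sketch of how that call could be wrapped to see what was removed and what is still in use. The function name prune_dataset_cache is hypothetical; it only relies on ds.cleanup_cache_files() returning the number of deleted files and on ds.cache_files being a list of dicts with a "filename" key, and it assumes ds is a datasets.Dataset you have already loaded.

```python
def prune_dataset_cache(ds):
    """Remove stale cache files for `ds` and report what remains in use.

    `ds` is assumed to be a `datasets.Dataset`; any object exposing
    `cleanup_cache_files()` and `cache_files` behaves the same here.
    """
    # Deletes cached .arrow files that the dataset is not currently using
    # and returns how many files were removed.
    removed = ds.cleanup_cache_files()
    # The files still backing `ds` (memory-mapped) are left untouched.
    in_use = [entry["filename"] for entry in ds.cache_files]
    return removed, in_use
```

Called as removed, in_use = prune_dataset_cache(ds), this gives the count of deleted files and the paths of the .arrow files that were kept.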