There are roughly 20,000 images, each with accompanying text, in a local folder, and building an imagefolder dataset from them took about 30 minutes. The build process appears to traverse the folders and run a series of verification checks. How should this be handled when there are billions of files?
```python
dataset = load_dataset(
    'imagefolder',
    data_dir='/home/data/ms_coco/val2017/',
    streaming=True,
    ignore_verifications=True,
    cache_dir='/home/data/ms_coco/huggingface/valid',
)
```
Please help me.
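For context on the billions-of-files case, one common workaround (not something the snippet above does; this is an assumption about a general technique, WebDataset-style sharding) is to pack the loose image files into a modest number of tar archives up front, so that a loader can stream whole shards sequentially instead of listing and verifying every individual file. A minimal sketch using only the Python standard library, with a hypothetical `pack_to_shards` helper:

```python
import tarfile
import tempfile
from pathlib import Path

SHARD_SIZE = 1000  # images per shard; tune for your storage and loader

def pack_to_shards(src_dir: Path, out_dir: Path, shard_size: int = SHARD_SIZE) -> int:
    """Pack every file under src_dir into out_dir/shard-NNNNNN.tar.

    Returns the number of shards written. Archiving turns billions of
    per-file metadata operations into a few sequential archive reads.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    files = sorted(p for p in src_dir.rglob("*") if p.is_file())
    shard_idx = 0
    for start in range(0, len(files), shard_size):
        with tarfile.open(out_dir / f"shard-{shard_idx:06d}.tar", "w") as tar:
            for p in files[start:start + shard_size]:
                tar.add(p, arcname=p.name)
        shard_idx += 1
    return shard_idx

if __name__ == "__main__":
    # Demo with stand-in files; point src at a real image directory instead.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "images"
        src.mkdir()
        for i in range(2500):
            (src / f"{i:05d}.jpg").write_bytes(b"\xff\xd8fake")
        n = pack_to_shards(src, Path(tmp) / "shards")
        print(n)  # 2500 files / 1000 per shard -> 3 shards
```

The shard size is a trade-off: larger shards mean fewer files to enumerate but coarser parallelism when multiple workers each read their own subset of shards.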