Building an imagefolder dataset takes too long

There are about 20,000 images, plus accompanying text metadata, in a local folder, and building an imagefolder dataset from it took about 30 minutes. The build process appears to traverse the folders, performing a series of checks on each file. How should this be handled if there are billions of files?

from datasets import load_dataset

dataset = load_dataset('imagefolder', data_dir='path/to/folder')  # placeholder path
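For context on where the time goes, here is a minimal stdlib sketch (not the actual datasets implementation) of the kind of recursive directory walk an imagefolder build has to perform: every file in the tree is visited and matched against the supported image extensions, so the cost grows linearly with the number of files. The folder layout and extension list below are illustrative assumptions.

```python
import os
import tempfile

def count_image_files(root, extensions=(".jpg", ".jpeg", ".png")):
    """Walk the whole tree, touching every file, and count the images.

    This mimics the per-file traversal cost of an imagefolder build;
    it is an illustration, not the library's real logic.
    """
    n = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extensions):
                n += 1
    return n

# Tiny demo tree: three image files plus one non-image file.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "train"), exist_ok=True)
for name in ("a.jpg", "b.png", "c.jpeg", "labels.txt"):
    open(os.path.join(root, "train", name), "wb").close()

print(count_image_files(root))  # 3
```

With billions of files, even a walk this simple becomes the bottleneck, which is why the build appears to hang while it scans.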

Please help me :grinning:.

Hi! Can you interrupt the process (CTRL + C or CMD + C) while it is running and paste the resulting stack trace here to help us debug the issue? Also, what's the output of the datasets-cli env command?