ImageFolder’s file resolution is currently not optimized for large datasets like this one. In your case, it’s best to create a dataset loading script or use Dataset.from_generator (with a generator that yields {"image": pil_image, "text": text} dictionaries) instead of load_dataset to generate the dataset.
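
A rough sketch of the Dataset.from_generator route, in case it helps. The directory layout is an assumption on my part (each image has a caption in a .txt file with the same name next to it), and the helper names iter_image_text_pairs, generate_examples and build_dataset are made up for illustration, so adapt them to however your images and texts are actually stored:

```python
from pathlib import Path

# Assumed image extensions; extend this set to match your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def iter_image_text_pairs(root):
    """Yield (image_path, text) pairs found under root.

    Hypothetical layout: "foo.jpg" is captioned by a sibling "foo.txt".
    Images without a caption file are skipped.
    """
    for image_path in sorted(Path(root).rglob("*")):
        if not image_path.is_file():
            continue
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption_path = image_path.with_suffix(".txt")
        if not caption_path.is_file():
            continue
        yield image_path, caption_path.read_text(encoding="utf-8").strip()


def generate_examples(root):
    """Generator yielding {"image": pil_image, "text": text} dictionaries."""
    # Imported here so the pairing helper above works without Pillow installed.
    from PIL import Image as PILImage

    for image_path, text in iter_image_text_pairs(root):
        pil_image = PILImage.open(image_path)
        pil_image.load()  # read the pixel data now instead of lazily
        yield {"image": pil_image, "text": text}


def build_dataset(root):
    """Build the dataset with Dataset.from_generator instead of load_dataset."""
    from datasets import Dataset, Features, Image, Value

    features = Features({"image": Image(), "text": Value("string")})
    return Dataset.from_generator(
        generate_examples,
        gen_kwargs={"root": root},
        features=features,
    )
```

You would then call something like ds = build_dataset("path/to/your/images"). Since the examples are produced one at a time by the generator, the whole file list is never resolved up front the way ImageFolder does it.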