I have been trying to fine-tune the SDXL model on a subset of LAION-Aesthetics 5+ (~89M image-text pairs) using the example training script.
I load the data with:

```python
dataset = load_dataset("imagefolder", data_dir=args.train_data_dir, split="train")
```

Here, `args.train_data_dir` points to a directory containing over 89M image-text pairs.
But the data loading time is far too long, and it eventually fails to load the data at all.
Is there a more efficient way to load data at this scale for text-to-image model training?
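To make the question concrete: what I think I need is lazy, shard-based iteration, instead of an upfront scan/index of all 89M files. Here is a stdlib-only sketch of that pattern (the JSONL shard layout and field names are made up for illustration, not my actual data format):

```python
import json
import os
import tempfile
from typing import Iterator, Tuple

def iter_pairs(shard_dir: str) -> Iterator[Tuple[str, str]]:
    """Lazily yield (image_path, caption) pairs, one shard at a time.

    Each shard is a JSONL file of {"image": ..., "text": ...} records,
    so no global index of all 89M files is ever built in memory.
    """
    for name in sorted(os.listdir(shard_dir)):
        if not name.endswith(".jsonl"):
            continue
        with open(os.path.join(shard_dir, name)) as f:
            for line in f:
                rec = json.loads(line)
                yield rec["image"], rec["text"]

# Tiny demo with two fake shards.
with tempfile.TemporaryDirectory() as d:
    for i in range(2):
        with open(os.path.join(d, f"shard-{i}.jsonl"), "w") as f:
            f.write(json.dumps({"image": f"img{i}.jpg", "text": f"caption {i}"}) + "\n")
    pairs = list(iter_pairs(d))

print(pairs)
```

Is this roughly what streaming/webdataset-style loaders do under the hood, and is one of them the recommended route here?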
Thanks in advance!