Optimizing Disk Usage for Large (Audio) Datasets

I’d recommend trying AudioFolder or a streaming WebDataset. Both are already well optimized and don’t duplicate the data locally.
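
For reference, here is a minimal sketch of what both options could look like with the `datasets` library. The helper names (`audiofolder_kwargs`, `webdataset_kwargs`, `load_streaming`) and the paths in the usage note are made up for illustration, and passing `streaming=True` in both cases is my assumption about how to keep the data from being copied into the local cache, so adjust it to your setup.

```python
from typing import Any, Dict, List


def audiofolder_kwargs(data_dir: str, split: str = "train") -> Dict[str, Any]:
    """Build load_dataset arguments for a local folder of audio files (hypothetical helper)."""
    return {
        "path": "audiofolder",   # generic folder-based audio builder
        "data_dir": data_dir,    # directory holding the audio files
        "split": split,
        "streaming": True,       # assumption: iterate lazily instead of building a local cache
    }


def webdataset_kwargs(shards: List[str], split: str = "train") -> Dict[str, Any]:
    """Build load_dataset arguments for WebDataset .tar shards (hypothetical helper)."""
    return {
        "path": "webdataset",            # builder for WebDataset-style tar shards
        "data_files": {split: shards},   # local paths or URLs of the shards
        "split": split,
        "streaming": True,               # read shards on the fly
    }


def load_streaming(kwargs: Dict[str, Any]):
    """Call datasets.load_dataset with the arguments built above."""
    # Imported here so the helpers above can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(**kwargs)
```

Usage would be something like `ds = load_streaming(audiofolder_kwargs("path/to/audio"))` followed by `for example in ds: ...`, where `path/to/audio` is a placeholder for your own directory. The WebDataset variant works the same way with a list of shard paths or URLs.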
