How to load this simple audio data set and use dataset.map without memory issues?

Can you check ds.cache_files? Since you loaded the dataset from memory using .from_pandas, the dataset has no associated cache directory in which to save intermediate results, so .map() keeps its output in memory as well.

To fix this, you can specify cache_file_name in .map(). That way the results are written to your disk instead of being held in memory 😉
