Thanks @lhoestq. Unfortunately, the result is the same even when I try the smallest possible values for N=10000. Could it be that I'm making a mistake somewhere else in my code (I mean the minimal example I provided)?
ds = ds.map(
    preprocess_function,
    remove_columns='audio',
    batch_size=1,
    writer_batch_size=1
)