Hi! How do you create your dataset?
Right now, every dataset loaded from disk is memory-mapped so that it doesn't fill your RAM. However, datasets created from in-memory data currently stay in memory.
So if you used Dataset.from_dict, for example, you may want to write your dataset to disk to avoid filling up your RAM. You can use ds.save_to_disk() and reload it with load_from_disk() before calling your map function.