ArrowMemoryError: realloc of size 32 GB failed

Hi! How do you create your dataset?

Right now, every dataset loaded from disk is memory-mapped so that it doesn't fill your RAM. However, datasets created from in-memory data currently stay in memory.

So if you used Dataset.from_dict, for example, you may want to write your dataset to disk to avoid filling up your RAM. You can use ds.save_to_disk() and reload it with load_from_disk() before calling your map function.