Multiprocessing map uses too much memory

A dataset created in memory (e.g. with .from_dict()) doesn’t have a cache file yet, so map() keeps its output in memory as well. If you want map() to write its results to disk instead of filling up your RAM, pass a cache_file_name to map().

Note that at some point we might automatically allocate a cache for such in-memory datasets, to align with the general behavior.