Working with large datasets - cache issues

The .map() function creates a cache file about 100 times larger than the original dataset file.
Can this behaviour be avoided somehow?
Note that it happens when calling .map() with num_proc=48.
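
For anyone who wants to reproduce the numbers, here is a minimal sketch of how the size ratio can be measured. It uses only the Python standard library and does not depend on the dataset library itself. The helper names (total_size_bytes, blowup_ratio) are made up for this post, and it assumes the cache lives in a directory you can point it at.

```python
import os


def total_size_bytes(path):
    """Return the size of a file, or the summed size of all files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so linked files are not counted twice.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def blowup_ratio(original_path, cache_path):
    """Ratio of cache size to original size (100.0 means 100 times larger)."""
    original = total_size_bytes(original_path)
    if original == 0:
        raise ValueError("original dataset is empty")
    return total_size_bytes(cache_path) / original
```

Calling blowup_ratio(original_path, cache_path) before and after the .map() call, once with num_proc=48 and once without it, should show whether the extra size really comes from the multiprocessing setting. Both paths are placeholders to be filled in for your own setup.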