Slow processing with map when using deepspeed or fairscale

What about DDP with the same number of processes?

```
python -m torch.distributed.launch --nproc_per_node=3
```

You will most likely see the same slow-down, since you now have more than one process competing over your limited resources. So if the problem is the same, then the culprit is neither deepspeed nor fairscale, but the number of processes you use.
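If the contention comes from every process redundantly running the same `map()` preprocessing, one common workaround is to let only the main process do the work while the others wait and then pick up the on-disk cache. Here is a minimal sketch of that pattern, assuming the process group is initialized via the launcher and that `LOCAL_RANK` is set in the environment (newer launchers do this; older `torch.distributed.launch` passes `--local_rank` as a CLI argument unless `--use_env` is given). The `tokenize` function and the dataset are placeholders:

```python
import os
import torch.distributed as dist
from datasets import load_dataset

def tokenize(example):
    # hypothetical stand-in for your real preprocessing
    example["length"] = len(example["text"])
    return example

# the launcher sets MASTER_ADDR/MASTER_PORT/RANK/WORLD_SIZE for env:// init
dist.init_process_group(backend="gloo")
local_rank = int(os.environ["LOCAL_RANK"])

ds = load_dataset("imdb", split="train")  # placeholder dataset

# Only the main local process runs the expensive map() and writes the
# on-disk cache; the others wait at the barrier and then load the cached
# result instead of recomputing it and competing for the same CPUs.
if local_rank != 0:
    dist.barrier()
ds = ds.map(tokenize)
if local_rank == 0:
    dist.barrier()
```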

When debugging such problems with deepspeed or fairscale, always try to take them out of the equation first and reproduce the same setup in straight pytorch.
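For example, a minimal repro script along these lines (again with a placeholder dataset and `tokenize` function) lets you time `map()` under plain `torch.distributed`, with no deepspeed or fairscale involved:

```python
# repro.py - time datasets.map() under plain torch.distributed, nothing else
import time
import torch.distributed as dist
from datasets import load_dataset

def tokenize(example):
    example["length"] = len(example["text"])  # stand-in for real preprocessing
    return example

dist.init_process_group(backend="gloo")  # no GPU work needed for this test
rank = dist.get_rank()

ds = load_dataset("imdb", split="train")  # placeholder dataset

start = time.time()
ds = ds.map(tokenize, load_from_cache_file=False)  # force recompute to measure
print(f"rank {rank}: map() took {time.time() - start:.1f}s")
```

Launch it the same way, e.g. `python -m torch.distributed.launch --nproc_per_node=3 repro.py`. If the timings match what you see with deepspeed or fairscale, the frameworks are off the hook and it's the process count.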