Is the CPU-offloading function in Accelerate the same as DeepSpeed’s?

VIArchitect · June 19, 2023, 10:41am

Hi, I’m using the Accelerate framework to offload the weight parameters to CPU DRAM for DNN inference.

To achieve this, I’m referring to Accelerate’s device_map, as described in the “Handling big models for inference” documentation.

However, I recently came across another document discussing DeepSpeed’s ZeRO-3 offload, which seems to offer a similar function.

I’m wondering whether these two approaches are the same or whether there are differences between them.
Specifically, am I already using DeepSpeed when I just pass device_map while loading the pretrained model?

smangrul · June 23, 2023, 9:32am

Hello, no, the two are different. device_map does naive pipelining (different layers on different GPUs/CPU RAM/disk), while DeepSpeed shards parameters, optimizer states, and gradients across GPUs and then offloads those partitions to CPU. DeepSpeed ZeRO-3 is generally used for training. Accelerate’s device_map is generally used for big model inference.
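
As a rough illustration of the two setups described above (a sketch, not code from this thread): the helper below loads a model with device_map through transformers’ from_pretrained, and the dict is a minimal DeepSpeed config with ZeRO stage 3 and parameter offload to CPU. The helper name, the offload directory, and the batch-size value are assumptions made for the example.

```python
def load_with_device_map(model_name, offload_dir="offload"):
    """Hypothetical helper: spread layers over GPU, CPU RAM, and disk."""
    # Imported lazily so the config below can be inspected
    # even when transformers is not installed.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",           # let Accelerate decide where each layer lives
        offload_folder=offload_dir,  # directory for weights offloaded to disk
    )


# Minimal DeepSpeed config sketch: ZeRO stage 3 with parameters offloaded to CPU.
zero3_offload_config = {
    "train_micro_batch_size_per_gpu": 1,  # assumed value for illustration
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
}
```

Passing device_map on its own does not involve DeepSpeed; a config like the one above only takes effect when the model is actually run through DeepSpeed.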

VIArchitect · June 25, 2023, 11:15am

Oh yes, I know that for training there is much more to it than just offloading parameters from GPU to CPU.
But I’m only using it for inference.
As far as I know, for inference DeepSpeed ZeRO-3 does no optimization beyond offloading parameters to CPU DRAM, so I thought the two were the same. Is that not the case?

smangrul · June 28, 2023, 12:23pm

During inference too, there is a clear difference between ZeRO-3 and device_map (naive pipelining). With ZeRO-3, each GPU runs inference on a different mini-batch, leading to higher throughput, whereas device_map processes the same batch while jumping across GPUs, leading to lower throughput.
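
The throughput argument above can be made concrete with a toy timing model. This is only a sketch under strong assumptions: every function and parameter name is hypothetical, one full forward pass is assumed to cost the same total time in both schemes, and communication, parameter gathering, and CPU-offload transfer costs are ignored entirely.

```python
import math


def naive_pipeline_time(num_batches, num_gpus, forward_time):
    """device_map-style: layers are split across GPUs, but a batch visits
    them one after another, so only one GPU is busy at any moment."""
    # num_gpus does not help here: batches are processed strictly in sequence.
    return num_batches * forward_time


def data_parallel_time(num_batches, num_gpus, forward_time):
    """ZeRO-3-style: every GPU runs the full forward pass on its own
    mini-batch, so batches are processed in waves of num_gpus."""
    waves = math.ceil(num_batches / num_gpus)
    return waves * forward_time


# 8 mini-batches, 4 GPUs, 1.0 time unit per forward pass.
pipeline = naive_pipeline_time(8, 4, 1.0)
parallel = data_parallel_time(8, 4, 1.0)
```

In this simplified model the gap comes purely from how many GPUs are busy at once; real numbers depend on the costs the sketch leaves out.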

VIArchitect · July 1, 2023, 8:36am

Yes, I think I have read that in the device_map documentation, but with a single GPU and a fixed batch size, can’t we conclude that the two are the same?
