I’ve been trying to figure out the nature of the DeepSpeed integration in the Trainer, especially with respect to Hugging Face Accelerate.
It seems that the Trainer uses Accelerate to facilitate DeepSpeed. But when I look at the documentation, it seems that we still launch with the deepspeed launcher or the PyTorch distributed launcher:
deepspeed --num_gpus=2 your_program.py <normal cl args> --deepspeed ds_config.json
or
python -m torch.distributed.launch --nproc_per_node=2 your_program.py <normal cl args>
But the docs don’t mention using the accelerate launcher. I’m confused by this, since the Trainer uses Accelerate to facilitate the DeepSpeed integration.
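For reference, the kind of ds_config.json I mean is roughly the following (a minimal ZeRO-2 sketch, not my exact file; as I understand it, the "auto" values are placeholders that the Trainer fills in from its own arguments):

{
  "zero_optimization": { "stage": 2 },
  "bf16": { "enabled": "auto" },
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto"
}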
As for the TRL library, it seems that it uses Accelerate for its trainers as well, but there the official way to launch is with the accelerate launcher:
accelerate launch --config_file=examples/accelerate_configs/deepspeed_zero{1,2,3}.yaml --num_processes {NUM_GPUS} path_to_script.py --all_arguments_of_the_script
I’m wondering whether the accelerate launcher is still interchangeable with the deepspeed launcher, and if not, what exactly the nature of the facilitation is.
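For context, the deepspeed_zero2.yaml-style Accelerate config I’m referring to has roughly this shape (paraphrased from memory, not the exact file in the TRL repo):

compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 2
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: false
mixed_precision: bf16
num_machines: 1
num_processes: 2
use_cpu: false

So in the Trainer case the DeepSpeed settings live in a ds_config.json passed on the command line, while in the TRL case they live inside the Accelerate config YAML, which is part of why I’m unsure whether the two launch methods are equivalent.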