I want to use Transformers with DeepSpeed, and it seems that the two main ways are to use it either with the Trainer or with Accelerate.

The only difference I can find is that you should use Accelerate only if you want to write your own training loop. Besides writing your own training loop, is there any other advantage to using Accelerate with DeepSpeed?

Is that the only difference?

Correct; the Trainer now runs on Accelerate under the hood.
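In practice that means with the Trainer you only write a DeepSpeed config file and point `TrainingArguments(deepspeed=...)` at it; the Trainer hands it off to Accelerate internally. A minimal ZeRO stage-2 sketch (the file name and specific values here are illustrative, and the `"auto"` values are filled in by the Transformers integration):

```json
{
  "bf16": { "enabled": "auto" },
  "zero_optimization": {
    "stage": 2,
    "overlap_comm": true,
    "contiguous_gradients": true
  },
  "gradient_accumulation_steps": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
```

Saving this as e.g. `ds_config.json` and passing `TrainingArguments(deepspeed="ds_config.json")`, then launching with `accelerate launch` or `deepspeed`, is enough — no custom loop needed.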


@muellerzr does that mean that if Accelerate is configured via `accelerate config` and we load the model using

    import torch
    import transformers

    model = transformers.AutoModelForCausalLM.from_pretrained(
        model_args.model_name,
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
        device_map="auto",
    )

then trainer.train() will use Accelerate on the back end to train/fine-tune the model on multiple GPUs, and there's no need to write a custom training loop?