I want to use Transformers with DeepSpeed, and it seems the two main ways are to use it either with the Trainer or with Accelerate.
The only difference I can find is that you should use Accelerate only if you want to write your own training loop.
Is that the only difference?
Correct, as the Trainer runs on Accelerate now
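Since DeepSpeed support goes through the Trainer, it is typically enabled by passing a config file to `TrainingArguments(deepspeed="ds_config.json")` (the filename here is just a convention) and launching the script with `deepspeed` or `accelerate launch`. A minimal ZeRO stage 2 config might look like this; the `"auto"` values are filled in from the Trainer's own arguments by the integration:

```json
{
  "zero_optimization": {
    "stage": 2
  },
  "fp16": {
    "enabled": "auto"
  },
  "train_batch_size": "auto",
  "gradient_accumulation_steps": "auto"
}
```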
@muellerzr does that mean that if the accelerator is configured via `accelerate config` and we load the model with
`model = transformers.AutoModelForCausalLM.from_pretrained(...)`,
then `trainer.train()` will use Accelerate on the back-end to train/fine-tune the model on multiple GPUs, with no need to write a custom training loop?