I want to use Transformers with DeepSpeed, and it seems the two main ways are to use it either with the Trainer or with Accelerate.
The only difference I can find is that you should use Accelerate only if you want to write your own training loop.
Is that the only difference?
Correct, as the Trainer runs on Accelerate now.
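For context, the "write your own training loop" path with Accelerate looks roughly like this (a minimal sketch; model, optimizer, and train_dataloader are assumed to be defined already and are not from this thread):

from accelerate import Accelerator

accelerator = Accelerator()

# Accelerate wraps these objects for whatever distributed setup
# was chosen via `accelerate config`
model, optimizer, train_dataloader = accelerator.prepare(
    model, optimizer, train_dataloader
)

model.train()
for batch in train_dataloader:
    outputs = model(**batch)
    loss = outputs.loss
    # use accelerator.backward() instead of loss.backward()
    accelerator.backward(loss)
    optimizer.step()
    optimizer.zero_grad()

The Trainer does all of this (and more) for you internally, which is why the custom loop is only needed when you want that level of control.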
@muellerzr does that mean that if Accelerate is configured via accelerate config, and we load the model using
import torch
import transformers

model = transformers.AutoModelForCausalLM.from_pretrained(
    model_args.model_name,   # model_args comes from the asker's script
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

# ... (Trainer set up with this model elsewhere) ...
trainer.train()
then trainer.train() will use Accelerate on the back end to train/fine-tune the model on multiple GPUs, and there's no need to write a custom training loop?
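For reference, the Trainer path being asked about looks roughly like this (a minimal sketch, not from the thread; output_dir, train_dataset, and the script name are hypothetical placeholders):

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="out",                 # hypothetical output directory
    per_device_train_batch_size=4,
    bf16=True,
)

trainer = Trainer(
    model=model,                      # the model loaded above
    args=training_args,
    train_dataset=train_dataset,      # hypothetical dataset
)

trainer.train()  # run with: accelerate launch train.py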