How do I run an end-to-end example of distributed data parallel (DDP) with Hugging Face's Trainer API, ideally on a single node with multiple GPUs?
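
To make it concrete, here's the kind of minimal script I'm imagining (rough, untested sketch; the model name and the IMDB dataset are just placeholders for what I'd actually train). My guess is that it would be launched with torchrun --nproc_per_node=<num_gpus> train.py and the Trainer would pick up DDP on its own, but please correct me if that's not the right way:

# train.py -- rough sketch, not tested
# launch (my assumption): torchrun --nproc_per_node=<num_gpus> train.py
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "bert-base-uncased"   # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

dataset = load_dataset("imdb")     # placeholder dataset, just to have something end to end

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=8,  # per GPU, so effective batch = 8 * num_gpus
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
)
trainer.train()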

Oh cool. Does Accelerate work even without Hugging Face (HF) models? I have a bunch of other code and models I've developed in PyTorch over the years, and it seems nearly trivial to plug Accelerate in.
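
To check my understanding, this is roughly what I'd expect it to look like with plain PyTorch (untested sketch; the tiny linear model and random data just stand in for my own code):

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()

model = nn.Linear(10, 2)           # stand-in for my own nn.Module
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))
train_loader = DataLoader(dataset, batch_size=8, shuffle=True)
loss_fn = nn.CrossEntropyLoss()

# let Accelerate place everything on the right device(s)
model, optimizer, train_loader = accelerator.prepare(model, optimizer, train_loader)

model.train()
for inputs, labels in train_loader:
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    accelerator.backward(loss)     # replaces loss.backward()
    optimizer.step()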

Btw, I want to ask this explicitly too: is Accelerate compatible with HF's Trainer?

e.g.

+ from accelerate import Accelerator
+ accelerator = Accelerator()

+ model, optimizer, training_dataloader, scheduler = accelerator.prepare(
+     model, optimizer, training_dataloader, scheduler
+ )

trainer = Trainer(...model, optimizer, training_dataloader, scheduler...)
trainer.train()

(inspired by Accelerate)
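
And if that combination does work, I'm assuming the script would be launched with something like the following, but again please correct me if not:

accelerate launch --num_processes 2 train.py
# or maybe, since it's a single node:
torchrun --nproc_per_node=2 train.py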

Btw, thanks in advance for being so generous with your advice :slight_smile: