How to run an end-to-end example of distributed data parallel with Hugging Face's Trainer API (ideally on a single node with multiple GPUs)?

Yes, quite so. For example, it's currently what's used for fastai's entire distributed module. If it's PyTorch, it can be done with Accelerate.
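
To make that concrete, here is a minimal sketch of wrapping a plain PyTorch loop with Accelerate. The toy model, random dataset, and hyperparameters are placeholders, not anything from the docs; the point is just where `prepare()` and `accelerator.backward()` go.

```python
# Launch with something like: accelerate launch train_loop.py
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()  # picks up the distributed setup from `accelerate launch`

# Toy model and data, purely for illustration
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(128, 10), torch.randint(0, 2, (128,)))
dataloader = DataLoader(dataset, batch_size=16, shuffle=True)

# prepare() wraps the model for DDP and shards the dataloader across processes
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

model.train()
for epoch in range(3):
    for inputs, targets in dataloader:
        optimizer.zero_grad()
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
        accelerator.backward(loss)  # replaces loss.backward()
        optimizer.step()
```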

The Accelerate repo has a CV example using timm.

Not right now, no, since Trainer already handles internally all the DDP that Accelerate could do. But as mentioned earlier, you can still use accelerate to launch those scripts :slight_smile:
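
In practice that means a plain Trainer script needs no DDP-specific code; you just start one process per GPU with a launcher. Below is a hedged sketch: the model (`bert-base-uncased`), dataset (`imdb`), and all hyperparameters are just example choices, and the launch commands in the comments assume 2 GPUs on a single node.

```python
# train.py
# Single node, multiple GPUs — launch with either of:
#   torchrun --nproc_per_node=2 train.py
#   accelerate launch --multi_gpu --num_processes 2 train.py
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "bert-base-uncased"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

dataset = load_dataset("imdb")  # example dataset

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=8,  # per-GPU batch size under DDP
    num_train_epochs=1,
)

# Trainer detects the distributed environment set up by the launcher
# and wraps the model in DDP on its own.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"].shuffle(seed=42).select(range(2000)),
)
trainer.train()
```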
