Using Transformers with DistributedDataParallel — any examples?

treeofknowledge · October 15, 2021, 12:45pm

The introduction to the Accelerate library says I have to be willing to write a forward loop (forgoing Trainer). Is there a way to enable DDP training while continuing to use Trainer?

Replacing _get_train_sampler with _get_eval_sampler looks like a much more elegant solution, thank you!
