Hey, I am trying to train a custom model (which inherits from PreTrainedModel) with IterableDataset using the HuggingFace Trainer in a DDP setup and I have a couple of questions on how to do it best as well as some of my observations. I know there is a datasets.distributed.split_dataset_by_node() …

How to handle IterableDataset with HuggingFace trainer and num_workers in DDP setup

proj-persona September 26, 2024, 6:51am 6

But if the dataset is already split by node, the Trainer should not need to skip examples, right?

So, how do you implement fast processing in DDP?
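
To make the question concrete, here is a minimal pure-Python sketch of what I mean by splitting by node and then by DataLoader worker. It does not call the datasets or Trainer APIs: `shard_for` is a hypothetical helper, and the strided assignment is my assumption about how such a split could look, not a description of what the libraries do internally.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def shard_for(stream: Iterable[T], rank: int, world_size: int,
              worker_id: int = 0, num_workers: int = 1) -> Iterator[T]:
    """Yield the examples owned by one (rank, worker) pair (hypothetical helper)."""
    # Each (rank, worker) pair gets a unique slot index in [0, total_slots).
    total_slots = world_size * num_workers
    slot = rank * num_workers + worker_id
    # Keep every total_slots-th example, starting at this pair's slot.
    return islice(stream, slot, None, total_slots)


# Toy setup: 2 DDP ranks, 2 DataLoader workers per rank, 12 examples.
world_size, num_workers = 2, 2
shards = {
    (rank, worker): list(shard_for(range(12), rank, world_size, worker, num_workers))
    for rank in range(world_size)
    for worker in range(num_workers)
}
for key, examples in shards.items():
    print(key, examples)
```

In this toy setup every example lands in exactly one shard, so nothing would need to be skipped again downstream. Each process still walks the whole stream to reach its own examples, though, which is the part I would like to make fast.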
