Minimal changes for using DataParallel?

Update for anyone else with the same problem: I’m now 99% sure the suggestion I was following was a ChatGPT hallucination. After much digging, it doesn’t appear to be possible to simply wrap a model in DataParallel and then use it with the Trainer.
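My rough understanding of why: the Trainer wants to manage device placement and parallelism itself, so handing it a model you’ve already wrapped just fights it. For reference, this is more or less the pattern that didn’t work for me (a sketch with a placeholder model, not my exact code):

```python
import torch
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

# The pattern that did NOT work for me (sketch; the model and
# arguments are placeholders): wrapping the model in DataParallel
# manually before handing it to the Trainer.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model = torch.nn.DataParallel(model)

trainer = Trainer(model=model, args=TrainingArguments(output_dir="out"))
```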

I wound up converting the notebook into a regular script and running it with

torchrun --nproc_per_node=2 script.py

…and it worked fine.
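For completeness, here’s a minimal sketch of the kind of script this became; the model and dataset are illustrative placeholders, not my actual setup:

```python
# script.py (note there is no DataParallel anywhere: launched via
# torchrun, the Trainer detects the distributed environment and sets
# up DistributedDataParallel itself; model/dataset are placeholders)
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

def main():
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

    # Tokenize a small slice of a toy dataset so the script runs quickly.
    train_ds = load_dataset("imdb", split="train[:1%]").map(
        lambda batch: tokenizer(
            batch["text"], truncation=True, padding="max_length", max_length=128
        ),
        batched=True,
    )

    args = TrainingArguments(
        output_dir="out",
        per_device_train_batch_size=8,  # per process, i.e. per GPU
        num_train_epochs=1,
    )

    trainer = Trainer(model=model, args=args, train_dataset=train_ds)
    trainer.train()

if __name__ == "__main__":
    main()
```

With torchrun --nproc_per_node=2, each of the two processes gets its own GPU and its own shard of each batch, which is why the script itself needs no manual wrapping.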

My takeaway is that it doesn’t seem possible to do multi-GPU training inside a notebook, which is fine! I can build a simple model in a notebook, then switch to a script when I want to scale it up.