Hello, I am running an example summarization training task taken from the official HuggingFace example on a multi-GPU machine, using the following versions: torch==1.11.0+cu113 and transformers==4.20.1. The only difference is that instead of using google/mt5-small as the model, I am using facebook/b…

HuggingFace summarization training example notebook raises two warnings when run on multi-GPUs

brando August 17, 2022, 3:38pm 6
