How to specify different batch sizes for different GPUs when training with run_mlm.py?

Zet · July 22, 2021, 1:54am

Hi,

I’m using run_mlm.py to train a custom BERT model. I have two GPUs available: one with 24 GB of memory and the other with 11 GB. I want to use a batch size of 64 on the larger GPU and 16 on the smaller one. How can I do that? The --per_device_train_batch_size parameter only takes a single number. Alternatively, can I just pass the combined batch size (80) and let the script figure out how to split the data between the GPUs?

Thanks!

sgugger · July 26, 2021, 12:57pm

This use case is not supported by the Trainer API. It would require custom scripts (one per GPU) to work, and even then, I’m not sure you will see any speed gain by training on two GPUs that are not of the same type.
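
For anyone who still wants to try the custom-script route, here is a rough sketch of the bookkeeping such scripts would need. It is not part of run_mlm.py or the Trainer API. The helper names `shard_ranges` and `loss_scale` are made up for illustration, and the sketch assumes a setup like PyTorch's DistributedDataParallel, where gradients are averaged equally across processes, so each process has to rescale its mean loss by its share of the global batch for the result to match a single batch of 80.

```python
from typing import Dict, List


def shard_ranges(per_rank_batch: List[int], dataset_size: int) -> Dict[int, List[range]]:
    """Give each rank per_rank_batch[rank] consecutive sample indices per step.

    Hypothetical helper: each per-GPU script would read only its own ranges.
    """
    global_batch = sum(per_rank_batch)
    steps = dataset_size // global_batch  # drop the incomplete final step
    shards: Dict[int, List[range]] = {rank: [] for rank in range(len(per_rank_batch))}
    for step in range(steps):
        start = step * global_batch
        for rank, size in enumerate(per_rank_batch):
            shards[rank].append(range(start, start + size))
            start += size
    return shards


def loss_scale(per_rank_batch: List[int], rank: int) -> float:
    """Factor to multiply a rank's mean loss by before backward().

    Assumes gradients are averaged with equal weight across ranks, so the
    factor is local_batch * world_size / global_batch.
    """
    world_size = len(per_rank_batch)
    return per_rank_batch[rank] * world_size / sum(per_rank_batch)


if __name__ == "__main__":
    batches = [64, 16]  # rank 0: 24 GB GPU, rank 1: 11 GB GPU
    shards = shard_ranges(batches, dataset_size=200)
    for rank in range(len(batches)):
        print(rank, shards[rank], loss_scale(batches, rank))
```

Even with this in place, every step still waits for the slower GPU to finish, which is the reason a speed gain is not guaranteed.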
