Multi-GPU fine-tuning BART

dakshvar22 · July 11, 2020, 11:37am

Hi,
I am trying to fine-tune the BART model checkpoints on a large dataset (around 1M data points). Since the dataset is large, I want to use a multi-GPU setup, but I see that because of this line it’s not currently possible to train in a multi-GPU setting. Are there any workarounds for it?

@sshleifer Tagging you here since you’ve worked with BART and summarization in particular a lot on the repo.

sshleifer · July 11, 2020, 2:29pm

Which task are you finetuning on?

For sequence-to-sequence tasks, like summarization, examples/seq2seq/finetune.py supports multi-GPU for training only. There is a caveat: you have to run the final eval yourself on one GPU.

For language modeling tasks, multi-GPU is supported through the Trainer class.

dakshvar22 · July 11, 2020, 3:23pm

Thanks for the reply. It’s a seq2seq task, but wouldn’t the assert condition fail during training if I specify multiple GPUs in the training command? Do you mean I can comment out that part and then run the script?
I am okay with the caveat.

valhalla · July 11, 2020, 4:53pm

Hi @dakshvar22,
you won’t need to comment out that line. Just set the sortish_sampler argument to False. It’s False by default anyway, so you won’t need to change anything.
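
To illustrate why the default setting sidesteps the assert, here is a minimal sketch of that kind of guard. It is not the actual code from finetune.py: the function name pick_train_sampler and the returned strings are hypothetical, and it only assumes that the single-GPU check is reached when sortish_sampler is True.

```python
def pick_train_sampler(gpus: int, sortish_sampler: bool = False) -> str:
    """Hypothetical stand-in for the sampler selection being discussed."""
    if sortish_sampler:
        # Assumption: the single-GPU assert lives only on this branch.
        assert gpus <= 1, "sortish sampler is single-GPU only in this sketch"
        return "sortish"
    # With sortish_sampler=False (the default), the assert is never reached.
    return "default"


if __name__ == "__main__":
    # Multi-GPU run with the default setting: no assertion is triggered.
    print(pick_train_sampler(gpus=4))
    # Single-GPU run with the sortish sampler enabled: also fine.
    print(pick_train_sampler(gpus=1, sortish_sampler=True))
```

In this sketch, only the combination of more than one GPU with sortish_sampler=True would hit the assertion.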
