I was trying to fine-tune Llama 70B on 4 GPUs using Unsloth.
I was able to bypass the multi-GPU detection by running this command; CUDA then only detects 1 GPU: `os.environ["CUDA_VISIBLE_DEVICES"] = "0"`
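For reference, here is a minimal sketch of that workaround. Note that the environment variable has to be set before torch/unsloth (or anything else that initializes CUDA) is imported, otherwise the runtime may already have enumerated all four GPUs:

```python
import os

# Restrict CUDA to a single device. This must run BEFORE importing
# torch or unsloth, because CUDA enumerates visible devices when the
# runtime is first initialized.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Imports of torch / unsloth would go here, after the variable is set;
# torch.cuda.device_count() should then report 1.
```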
However, when I tried to run the fine-tuning, `trainer_stats = trainer.train()` threw a CUDA out-of-memory error. It was only taking GPU 0 into account for the memory estimation, which is not enough memory for the fine-tuning.
How can we bypass this? Or is there another trick for it?