How long does it take to train the Falcon 7B model on an RTX 4090 GPU?

Hi,
This is my first question here, so sorry if it's been asked before or looks very basic. I am interested in finding a simple formula to estimate how long it would take to train the Falcon 7B model (or other models like Falcon 40B, GPT-3, etc.) on a single 4090 GPU.

I know it may be insane to attempt such a thing, but as an AI student/hobbyist on a tight budget, it's always interesting to know which models we could train at home on our single, dual, or quad GPU setups.

Is there a simple formula where we can plug in our GPU's memory and speed, along with the model's parameter count or size, to get a rough estimate of the training time?

One rule of thumb from my experience (and I've only used MPT models): full training takes about 12x as much GPU memory as the size of the model. There are some tricks to reduce that, but I haven't tried them. A rough breakdown of where that memory goes is sketched below.
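
To make that rule of thumb concrete, here is a back-of-the-envelope sketch. It assumes full fine-tuning with the Adam optimizer in mixed precision; the bytes-per-parameter figures are standard estimates, not measurements, and the sketch deliberately omits activations (which depend on batch size and sequence length) and framework overhead, which is part of why the multiplier cited above is higher:

```python
# Back-of-the-envelope GPU memory estimate for full fine-tuning with Adam
# in mixed precision. Byte counts per parameter are rough standard figures;
# activations and overhead are excluded.
def training_memory_gib(n_params: float) -> dict:
    bytes_per_param = {
        "weights (fp16)": 2,
        "gradients (fp16)": 2,
        "Adam momentum (fp32)": 4,
        "Adam variance (fp32)": 4,
        "fp32 master weights": 4,
    }
    gib = 1024 ** 3
    return {name: n_params * b / gib for name, b in bytes_per_param.items()}

estimate = training_memory_gib(7e9)  # a 7B-parameter model
for name, gib in estimate.items():
    print(f"{name}: {gib:.1f} GiB")
print(f"total: {sum(estimate.values()):.1f} GiB")
# ~16 bytes/param, i.e. on the order of 100 GiB for a 7B model before
# activations -- far beyond a single 24 GB card for full fine-tuning.
```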

As for time, that's not really an answerable question. You can train a model in an hour, but it won't be very good. You keep training the model and testing checkpoints until it's good enough. I have been training for several months (coded as a screen saver, so it only runs when I'm not using the computer) on a 3090 GPU; the result isn't terrible, but MPT-7B is still better for most of my use cases. I haven't used the Falcon 7B model, but I'd hazard a guess that it's probably better, too.
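
If it helps, that "train until it's good enough" workflow boils down to periodic checkpointing and evaluation. Here's an illustrative sketch using the Hugging Face Trainer; the model and dataset variables are assumed to already exist, and every numeric value is a placeholder, not a recommendation:

```python
# Illustrative checkpoint-and-evaluate loop with the Hugging Face Trainer.
# `model`, `train_ds`, and `eval_ds` are assumed to be defined elsewhere.
from transformers import Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="checkpoints",
    max_steps=10_000,
    save_steps=500,                   # write a checkpoint every 500 steps
    evaluation_strategy="steps",      # evaluate on the same schedule
    eval_steps=500,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,   # simulate a larger batch on one GPU
)
trainer = Trainer(model=model, args=args,
                  train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
# After an interruption (e.g. you start using the computer again), resume with:
# trainer.train(resume_from_checkpoint=True)
```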

Good luck and have fun!

@sgtflame For one 4090 with 24 GB of VRAM, your 12x rule means you can train at most a 2 GB model? And since MPT-7B looks like it's about 10 GB, the 12x rule would make it impossible to train even on a build with two 4090s, no matter whether you train for six months?
Where can I read more about the relationship between model size and the resources needed to train it?

Hi,

I recommend reading Methods and tools for efficient training on a single GPU, which includes many tips and tricks for training your models efficiently on a single GPU. Memory usage is explained in detail in Model training anatomy.

Regarding training (fine-tuning) a 7B model on a single RTX 4090 GPU (which has 24 GB of VRAM), that is only possible using either LoRA or QLoRA, which freeze the base model (in half precision or in 4-bit precision) and train adapters on top of it. I have a notebook on that here. A minimal sketch of the QLoRA setup is shown below.
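
For reference, here is a minimal QLoRA-style sketch using transformers, bitsandbytes, and peft. The hyperparameters are illustrative, and the `target_modules` entry assumes Falcon's fused `query_key_value` attention projection; check the module names of whichever checkpoint you use:

```python
# Minimal QLoRA sketch: load the base model in 4-bit, freeze it,
# and train small LoRA adapters on top.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "tiiuae/falcon-7b"  # Falcon 7B checkpoint on the Hugging Face Hub

# Quantize the base weights to 4-bit so the model fits in 24 GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Freeze the quantized base model and attach trainable LoRA adapters.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,                                # adapter rank (illustrative)
    lora_alpha=32,
    target_modules=["query_key_value"],  # Falcon's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the 7B params
```

From here you can pass the wrapped model to a standard training loop or the Trainer; only the adapter weights receive gradients, which is what makes the memory footprint manageable on a single 24 GB card.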