Why can't the BLOOM model be run (really slowly) on consumer hardware?

I read in the community forum for BLOOM on Hugging Face that you need around 400GB of GPU memory to run inference. Why can't you just keep the weights on an SSD and split the work into 400/8 = 50 chunks with an 8GB consumer GPU?

I’m sorry if this is a really dumb question.

As long as you have the disk space, the model can be run on any setup (albeit slowly) with Accelerate. Some users have run it on two GPUs, for instance.

With just 8GB of GPU memory, however, you will be limited: it's possible the largest layer of the model does not fit.
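
Concretely, the disk-offload setup looks roughly like the sketch below. This is a sketch, not a definitive recipe: the `offload` folder name is arbitrary, and the smaller `bigscience/bloom-7b1` checkpoint is used here just for illustration (swap in `bigscience/bloom` for the full model if you have the disk space).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-7b1"  # illustrative; the full model is "bigscience/bloom"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",           # let Accelerate spread layers across GPU, CPU RAM, and disk
    offload_folder="offload",    # weights that fit nowhere in memory get spilled here
    torch_dtype=torch.float16,   # half-precision weights roughly halve the footprint
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```

With `device_map="auto"`, Accelerate loads each layer's weights on demand, which is why generation is slow but still possible; the hard constraint is that any single layer must fit on the GPU at once.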


Thank you!