My question: how do I run a very big model like BLOOM on a cluster of machines?

albe60 · May 26, 2023, 2:34pm

Hello, I can run OPT-66B on one server with 6 GPUs (24 GB each) by following your Hugging Face page on how to load big models: I pass a device_map. I can also run BLOOM on one server with 8 GPUs (24 GB each) by passing a device_map, but it offloads to CPU and takes a long time to answer.

My question is how to run a very big model like BLOOM on a cluster of machines. BLOOM would need 20 GPUs with 24 GB each, so it needs a cluster of 3 machines with 8 GPUs each to deploy. With Accelerate this is not possible, as we are limited to a single machine. With DP and DDP it is not possible either, as the model spans more than one machine. I have tried everything: DeepSpeed Inference, the RPC framework, etc.

Thanks for your help.
Regards,
Pat
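For reference, here is a rough sketch of the arithmetic behind the 20-GPU, 3-machine estimate. It assumes BLOOM's 176B parameters are stored as 16-bit weights. The `usable_fraction` of 0.75 is a hypothetical headroom figure for activations and framework overhead, not a measured value, and the helper names are made up for illustration.

```python
import math

def gpus_needed(n_params: float,
                bytes_per_param: int = 2,
                gpu_mem_gb: float = 24.0,
                usable_fraction: float = 0.75) -> int:
    """Estimate how many GPUs are needed to hold the model weights.

    usable_fraction is an ASSUMED share of each GPU left for weights
    after activations and overhead; tune it for your own setup.
    """
    weights_gb = n_params * bytes_per_param / 1e9  # total weight size in GB
    usable_gb = gpu_mem_gb * usable_fraction       # per-GPU budget for weights
    return math.ceil(weights_gb / usable_gb)

def machines_needed(n_gpus: int, gpus_per_machine: int = 8) -> int:
    """Number of machines required to host n_gpus."""
    return math.ceil(n_gpus / gpus_per_machine)

bloom_params = 176e9  # BLOOM has 176B parameters

weights_only = gpus_needed(bloom_params, usable_fraction=1.0)
with_headroom = gpus_needed(bloom_params)

print(weights_only)                    # 352 GB of weights / 24 GB per GPU
print(with_headroom)                   # same weights with the assumed 25% headroom
print(machines_needed(with_headroom))  # machines with 8 GPUs each
```

Under these assumptions the weights alone already exceed the 8 x 24 GB = 192 GB on a single server, which is why device_map falls back to CPU offload there.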
