I am trying to deploy meta-llama/Llama-2-70b-chat-hf (from Hugging Face) on SageMaker.
The sample code recommends using an "ml.g5.2xlarge" instance.
That seems like a very small instance for a model with this many parameters, but it is what the code indicates.
However, the deployment fails after about 15 minutes with:
ai.djl.engine.EngineException: GPU devices are not enough to run 2 partitions.
What is the smallest instance I can run this model on, then?
If the problem is the number of GPUs, the only alternative seems to be a multi-GPU g5.12xlarge, which costs about 5x as much, a prohibitive cost for me. Does this problem apply to all LLMs, or is there a smaller LLM that can run on a g5.xlarge/g5.2xlarge?
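For reference, here is the back-of-envelope arithmetic that made me doubt the single-GPU instance in the first place. These are my own numbers, not from the model card: per the AWS specs, a g5.2xlarge has one 24 GiB A10G and a g5.12xlarge has four, and I assume fp16/bf16 weights at 2 bytes per parameter (no quantization):

```python
# Rough GPU memory check for Llama-2-70B (my own estimate, not official sizing).
# Assumption: weights loaded in fp16/bf16, i.e. 2 bytes per parameter,
# ignoring KV cache and activation overhead (which only make it worse).

def weights_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just for the model weights, in GiB."""
    return num_params * bytes_per_param / 1024**3

llama2_70b = weights_gib(70e9)   # ~130 GiB for fp16 weights alone
g5_2xlarge_vram = 1 * 24         # one NVIDIA A10G, 24 GiB
g5_12xlarge_vram = 4 * 24        # four A10Gs, 96 GiB total

print(f"70B fp16 weights: ~{llama2_70b:.0f} GiB")
print(f"g5.2xlarge VRAM:  {g5_2xlarge_vram} GiB")
print(f"g5.12xlarge VRAM: {g5_12xlarge_vram} GiB")
```

By this arithmetic even the g5.12xlarge would not hold the unquantized weights, which is why I am confused about what the sample code expects.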