Is Llama 2 supported by the Hugging Face Text Generation Inference (TGI) Deep Learning Container on Amazon SageMaker?

I attempted to deploy the llama-2-70b-chat-hf model to SageMaker using the Hugging Face Text Generation Inference (TGI) Deep Learning Container and got this error:

UnexpectedStatusException: Error hosting endpoint Llama-2-70b-chat-hf-2023-08-03-22-10-57: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..
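For context, the deployment looked roughly like this (a simplified sketch; the TGI image version, instance type, GPU count, and token limits are approximations of my setup rather than exact values):

```python
import json

import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()

# TGI (LLM) Deep Learning Container image; version is approximate
image_uri = get_huggingface_llm_image_uri("huggingface", version="0.8.2")

# Container environment: the model is pulled from the Hub at startup
env = {
    "HF_MODEL_ID": "meta-llama/Llama-2-70b-chat-hf",
    "SM_NUM_GPUS": json.dumps(8),             # shard across the 8 GPUs of a p4d
    "MAX_INPUT_LENGTH": json.dumps(2048),
    "MAX_TOTAL_TOKENS": json.dumps(4096),
    "HUGGING_FACE_HUB_TOKEN": "<hf_token>",   # placeholder
}

model = HuggingFaceModel(role=role, image_uri=image_uri, env=env)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.p4d.24xlarge",
    container_startup_health_check_timeout=900,  # extra time to download/load 70B weights
)
```

(`<hf_token>` is a placeholder; the Llama 2 repos on the Hub are gated, so the container needs a token with access to download the weights.)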

So: is Llama 2 supported by the Hugging Face Text Generation Inference (TGI) Deep Learning Container on Amazon SageMaker?

The only list of officially supported model architectures I can find is the following, published before Llama 2 was released:

  • BLOOM / BLOOMZ

  • MT0-XXL

  • Galactica

  • SantaCoder

  • GPT-Neox 20B (joi, pythia, lotus, rosey, chip, RedPajama, open assistant)

  • FLAN-T5-XXL (T5-11B)

  • Llama (vicuna, alpaca, koala)

  • Starcoder / SantaCoder

  • Falcon 7B / Falcon 40B

Source: Introducing the Hugging Face LLM Inference Container for Amazon SageMaker