Is Llama 2 supported by the Hugging Face Text Generation Inference (TGI) Deep Learning Container on Amazon SageMaker?

I attempted to deploy the llama-2-70b-chat-hf model to SageMaker using the Hugging Face Text Generation Inference (TGI) Deep Learning Container and got this error:

UnexpectedStatusException: Error hosting endpoint Llama-2-70b-chat-hf-2023-08-03-22-10-57: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..
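For context, the deployment looked roughly like this (a simplified sketch; the TGI image version, instance type, GPU count, and token limits are approximations of my setup rather than exact values):

```python
import json

import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()

# TGI (LLM) Deep Learning Container image; version is approximate
image_uri = get_huggingface_llm_image_uri("huggingface", version="0.8.2")

# Container environment: the model is pulled from the Hub at startup
env = {
    "HF_MODEL_ID": "meta-llama/Llama-2-70b-chat-hf",
    "SM_NUM_GPUS": json.dumps(8),             # shard across the 8 GPUs of a p4d
    "MAX_INPUT_LENGTH": json.dumps(2048),
    "MAX_TOTAL_TOKENS": json.dumps(4096),
    "HUGGING_FACE_HUB_TOKEN": "<hf_token>",   # placeholder
}

model = HuggingFaceModel(role=role, image_uri=image_uri, env=env)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.p4d.24xlarge",
    container_startup_health_check_timeout=900,  # extra time to download/load 70B weights
)
```

(`<hf_token>` is a placeholder; the Llama 2 repos on the Hub are gated, so the container needs a token with access to download the weights.)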

So: is Llama 2 supported by the Hugging Face Text Generation Inference (TGI) Deep Learning Container on Amazon SageMaker?

The only list of officially supported model architectures I can find is the following, published before Llama 2 was released:

  • BLOOM / BLOOMZ

  • MT0-XXL

  • Galactica

  • SantaCoder

  • GPT-Neox 20B (joi, pythia, lotus, rosey, chip, RedPajama, open assistant)

  • FLAN-T5-XXL (T5-11B)

  • Llama (vicuna, alpaca, koala)

  • Starcoder / SantaCoder

  • Falcon 7B / Falcon 40B

Source: Introducing the Hugging Face LLM Inference Container for Amazon SageMaker