QLoRA-trained LLaMA-2 13B deployment error on SageMaker using the text-generation-inference image

I haven't been able to deploy with the new Hugging Face LLM container (version 0.8.2) on SageMaker. However, I managed to deploy the trained output model using custom inference code, i.e. a `model_fn(model_dir)` and a `predict_fn(data, model_and_tokenizer)` in an `inference.py` script (sketched below).
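For reference, this is a minimal sketch of the kind of `inference.py` that worked for me with the standard Hugging Face inference container. It assumes the QLoRA adapter was already merged into the base weights before packaging `model.tar.gz`, so a plain Transformers checkpoint is loaded; the payload shape (`inputs` / `parameters`) and dtype/device settings are illustrative, not the only way to do it:

```python
# inference.py -- custom handlers for the SageMaker Hugging Face inference toolkit.
# Assumes the QLoRA adapter was merged into the base model before packaging,
# so model_dir contains a regular Transformers checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def model_fn(model_dir):
    """Load model and tokenizer from the unpacked model.tar.gz."""
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(
        model_dir,
        torch_dtype=torch.float16,  # fits 13B on fewer GPUs; requires `accelerate`
        device_map="auto",
    )
    return model, tokenizer


def predict_fn(data, model_and_tokenizer):
    """Generate a completion for the incoming JSON payload."""
    model, tokenizer = model_and_tokenizer
    prompt = data.pop("inputs")            # assumed payload key
    params = data.pop("parameters", {})    # e.g. {"max_new_tokens": 256}
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **params)
    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return {"generated_text": text}
```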

I opened a similar issue here:

If you find a solution, please let me know.