Error loading finetuned llama2 model while running inference

marmikpandya · August 2, 2023, 10:03am

To anyone else facing this problem: it works totally fine on a plain old EC2 instance with TGI v1.0.0. That would be because Text Generation Inference (TGI) added support for Llama 2 in v0.9.3, while the SageMaker Python SDK only recognises versions up to 0.8.2. I used a g4dn.12xlarge instance.
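
As a rough illustration of the version mismatch described above, here is a minimal sketch that checks whether a given TGI version string is at or above the Llama 2 threshold. The 0.9.3 threshold is taken from the post; the helper names `parse_version` and `supports_llama2` are hypothetical and not part of TGI or the SageMaker SDK, and the sketch assumes plain `major.minor.patch` version strings.

```python
# Minimum TGI version with Llama 2 support, as stated in the post above.
LLAMA2_MIN_TGI = (0, 9, 3)


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a string such as 'v1.0.0' or '0.8.2' into a tuple of ints.

    Assumes a plain dotted numeric version; suffixes like '-rc1' are not handled.
    """
    return tuple(int(part) for part in version.lstrip("v").split("."))


def supports_llama2(tgi_version: str) -> bool:
    """Hypothetical helper: True if the TGI version is at or above the threshold."""
    return parse_version(tgi_version) >= LLAMA2_MIN_TGI


if __name__ == "__main__":
    # 0.8.2 is the newest version the SDK recognised according to the post;
    # v1.0.0 is the version that worked on EC2.
    for version in ("0.8.2", "0.9.3", "v1.0.0"):
        print(version, supports_llama2(version))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings lexically (for example, "0.10.0" sorting before "0.9.3").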
