Error loading finetuned llama2 model while running inference

Hi Everyone! I’m having the same problem…
So it sounds like the SageMaker Python SDK doesn't ship the up-to-date "text generation inference" container that LLaMA 2 needs. Can we get around this by deploying directly from the AWS Console, or is there a way to use the sagemaker and huggingface packages to deploy without building an EC2 instance?

I'm also following the example linked in the original question; after hitting this issue with my adaptation of it, I'm currently trying to run the example as-is.

Thanks!