Error loading finetuned llama2 model while running inference

Hi Everyone! I’m having the same problem…
So it sounds like the SageMaker Python SDK doesn't ship the up-to-date "text generation inference" container that LLaMA 2 needs. Can we get around this by deploying directly from the AWS Console, or is there a way to use the sagemaker and huggingface packages to deploy without building an EC2 instance?

I'm also following the example linked in the original question; after hitting this issue with my adaptation of it, I'm currently trying to run the example as-is.

Thanks!