Error loading finetuned llama2 model while running inference

Did you manage to run it with your own data? I hit the error reported in this thread when trying to deploy my QLoRA-trained Llama 2 13B: QLoRA trained LLaMA2 13B deployment error on Sagemaker using text generation inference image - #12 by rycfung

[EDIT] I managed to run my own model: for a Llama 2 13B, you need to deploy on an ml.g5.12xlarge (which is a bit odd, considering you can run inference in a notebook on an ml.g5.2xlarge :man_shrugging:).
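For reference, here is a minimal sketch of what worked for me, using the SageMaker Python SDK and the Hugging Face TGI container. The S3 path, IAM role, and the TGI environment values are placeholders; adjust them to your own model (the `SM_NUM_GPUS=4` setting matches the four A10G GPUs of an ml.g5.12xlarge).

```python
# Sketch of deploying a fine-tuned Llama 2 13B behind a SageMaker endpoint
# with the Hugging Face TGI image. All names/paths below are placeholders.

# TGI environment: shard the 13B model across the 4 GPUs of an ml.g5.12xlarge.
tgi_env = {
    "HF_MODEL_ID": "/opt/ml/model",  # load merged fine-tuned weights from model_data
    "SM_NUM_GPUS": "4",              # ml.g5.12xlarge has 4x A10G
    "MAX_INPUT_LENGTH": "2048",
    "MAX_TOTAL_TOKENS": "4096",
}

INSTANCE_TYPE = "ml.g5.12xlarge"


def deploy(model_data_s3_uri: str, role_arn: str):
    """Deploy the model to a real-time endpoint (requires AWS credentials)."""
    from sagemaker.huggingface import (
        HuggingFaceModel,
        get_huggingface_llm_image_uri,
    )

    # Resolve the TGI (text-generation-inference) container image URI.
    image_uri = get_huggingface_llm_image_uri("huggingface")

    model = HuggingFaceModel(
        model_data=model_data_s3_uri,  # tar.gz with the merged fine-tuned weights
        role=role_arn,
        image_uri=image_uri,
        env=tgi_env,
    )
    # 13B takes a while to load, so give the container a generous health-check timeout.
    return model.deploy(
        initial_instance_count=1,
        instance_type=INSTANCE_TYPE,
        container_startup_health_check_timeout=600,
    )
```

Note that the fine-tuned adapter should be merged into the base weights before packaging `model_data`; TGI serves full checkpoints, not a base model plus a separate LoRA adapter.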