I followed @philschmid's blog post on fine-tuning, then deployed the model to an endpoint with the code below. Invoking the endpoint returned the following error:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "\u0027llama\u0027"
}
Deployment Code:
from sagemaker.huggingface import HuggingFaceModel

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    model_data=model_s3,  # Change to your model path
    role=role,
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39",
    model_server_workers=1
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name='llama2-7b-1'
)
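
A possible lead, not a confirmed diagnosis: the message 'llama' looks like a Python KeyError raised when the installed transformers release does not recognize the llama model type. As far as I know, Llama support first shipped in transformers 4.28.0, which is newer than the 4.26 container requested above. The sketch below is a plain-Python check of that assumption; supports_llama and LLAMA_MIN_TRANSFORMERS are hypothetical names for illustration, not part of the SageMaker SDK.

```python
# Hypothetical helper for illustration only; not part of the sagemaker SDK.
# Assumption: the "llama" model type first shipped in transformers 4.28.0.
LLAMA_MIN_TRANSFORMERS = (4, 28)


def supports_llama(transformers_version: str) -> bool:
    """Return True if the given transformers version is at least 4.28."""
    parts = transformers_version.split(".")
    major = int(parts[0])
    # Treat a missing minor component (e.g. "5") as zero.
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor) >= LLAMA_MIN_TRANSFORMERS


print(supports_llama("4.26"))    # version requested in the deployment code: False
print(supports_llama("4.28.1"))  # True
```

If the assumption holds, the endpoint would need a container with a newer transformers version. Which versions the SageMaker Hugging Face containers actually offer is something to confirm in the SDK documentation.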