I'm trying to deploy my fine-tuned Llama 3 model on AWS, so the first step is to create an endpoint.
I used instance_type="ml.g5.4xlarge".
This is my code:
import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'Guepard/knaine_llama3.1_v0',
    'HF_TASK': 'text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="2.0.2"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.4xlarge",
    container_startup_health_check_timeout=1200,
)

# send request
predictor.predict({
    "inputs": "My name is Julien and I like to",
})
I get the following error:
UnexpectedStatusException: Error hosting endpoint huggingface-pytorch-tgi-inference-2024-08-23-08-25-25-823: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint… Try changing the instance type or reference the troubleshooting page Troubleshooting - Amazon SageMaker
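The error points at the CloudWatch logs for the endpoint. For reference, this is a minimal sketch of how those logs could be pulled with boto3; it assumes the default /aws/sagemaker/Endpoints/&lt;endpoint-name&gt; log group and reuses the endpoint name from the error above:

import boto3

# Assumption: SageMaker writes endpoint container logs to the
# /aws/sagemaker/Endpoints/<endpoint-name> log group by default.
endpoint_name = "huggingface-pytorch-tgi-inference-2024-08-23-08-25-25-823"
log_group = f"/aws/sagemaker/Endpoints/{endpoint_name}"

logs = boto3.client("logs")

# List the log streams for this endpoint, newest first
streams = logs.describe_log_streams(
    logGroupName=log_group,
    orderBy="LastEventTime",
    descending=True,
)["logStreams"]

# Print the most recent events from each stream
for stream in streams:
    events = logs.get_log_events(
        logGroupName=log_group,
        logStreamName=stream["logStreamName"],
        limit=50,
    )["events"]
    for event in events:
        print(event["message"])

I'm not sure what to look for in that output, though.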
Any solution?