mahmutc
September 27, 2024, 8:59am
2
Hi @bhashwarsengupta
I’m not familiar with AWS SageMaker, but let me share this related issue. Checking the CloudWatch log group `/aws/sagemaker/Endpoints/<your_endpoint_name>` should give more details about why the endpoint failed.
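If it helps, here is a rough sketch of how those logs could be pulled with boto3 rather than through the console. I have not run it against a real endpoint; the helper names (`endpoint_log_group`, `recent_log_messages`) are made up for illustration, and it assumes you pass in a CloudWatch Logs client such as `boto3.client("logs")` with credentials that can read the log group.

```python
def endpoint_log_group(endpoint_name):
    """Build the CloudWatch log group name used for a SageMaker endpoint."""
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"


def recent_log_messages(logs_client, endpoint_name, max_streams=3, max_events=50):
    """Collect recent log messages from the endpoint's most recently active streams.

    logs_client is assumed to be a boto3 CloudWatch Logs client,
    e.g. boto3.client("logs"). Both helper names are hypothetical.
    """
    group = endpoint_log_group(endpoint_name)

    # Most recently active streams first (typically one per variant/instance).
    streams = logs_client.describe_log_streams(
        logGroupName=group,
        orderBy="LastEventTime",
        descending=True,
        limit=max_streams,
    )["logStreams"]

    messages = []
    for stream in streams:
        # startFromHead=False returns the latest events in the stream.
        events = logs_client.get_log_events(
            logGroupName=group,
            logStreamName=stream["logStreamName"],
            limit=max_events,
            startFromHead=False,
        )["events"]
        messages.extend(event["message"] for event in events)
    return messages
```

Usage would look something like `recent_log_messages(boto3.client("logs"), "<your_endpoint_name>")`. A stack trace from the container at startup is usually the thing to look for.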
opened 04:26PM - 08 Mar 18 UTC
closed 09:46PM - 19 Mar 18 UTC
I am trying to deploy a BYOB (bring your own model) Keras model. I pushed the image to ECR with the 'latest' tag. All local testing passed, and I am able to successfully train the model, e.g.:
```python
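import sagemaker as sage  # assumed alias; the issue uses `sage` without showing the import

# account, region, role, and sess are defined earlier (not shown in the issue)
image = '{}.dkr.ecr.{}.amazonaws.com/my-model:latest'.format(account, region)
dl = sage.estimator.Estimator(image,
                              role, 1, 'ml.c4.2xlarge',
                              output_path="s3://{}/output".format(sess.default_bucket()),
                              sagemaker_session=sess)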
```
However, attempting to deploy gives me the error:
```
Failed Reason: The primary container for production variant AllTraffic did not pass the ping health check.
```
I am not quite sure where this stems from, given that the local health check passed. Any insight would be great! Thanks.
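For context on the error itself: as far as I understand, SageMaker health-checks a serving container by sending `GET /ping` to port 8080 and expects an HTTP 200 in reply. Below is a minimal sketch for probing a container you have started locally. It assumes the container is running in serving mode and listening on `http://localhost:8080`; the function name `ping_status` is only illustrative.

```python
import urllib.error
import urllib.request


def ping_status(base_url="http://localhost:8080", timeout=2.0):
    """Return the HTTP status of GET <base_url>/ping, or None if unreachable.

    base_url is an assumption: a container started locally and
    published on port 8080.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/ping", timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 404 or 500).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return None
```

A result of `200` means the local ping route works. `None` or an error code suggests the server did not start or the route is missing, which would match the failed health check above.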