I’m trying to deploy a private Hugging Face model to AWS SageMaker by running the following code locally:
import os

import boto3
import sagemaker
from dotenv import load_dotenv
from huggingface_hub import login
from sagemaker.huggingface import HuggingFaceModel

# Load environment variables from .env
load_dotenv()

# Get the Hugging Face access token and log in
token = os.environ.get("HF_LOGIN", None)
login(token)

hub = {
    "HF_MODEL_ID": "Org/model-name",
    "HF_TASK": "text-classification",
    "HUGGING_FACE_HUB_TOKEN": token,
}

iam = boto3.client("iam")
role = iam.get_role(RoleName="AmazonSageMaker-ExecutionRole")["Role"]["Arn"]
session = sagemaker.Session(boto3.Session(region_name="ap-southeast-2"))

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,                    # IAM role with the necessary permissions
    transformers_version="4.17",  # Transformers version of the inference container
    pytorch_version="1.10",       # PyTorch version of the inference container
    py_version="py38",            # Python version of the inference container
    sagemaker_session=session,
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,     # Number of instances (1 for a small load)
    instance_type="ml.m5.xlarge", # Instance type sized for the model
)
However, the deploy() call just keeps running and never finishes, and I’m not sure why. Any advice would be greatly appreciated.
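For reference, this is roughly how I’ve been checking whether the endpoint ever leaves the Creating state while deploy() blocks. It’s only a sketch: the status lookup is injected as a callable so the polling logic is self-contained, and in practice `get_status` would wrap `boto3.client("sagemaker").describe_endpoint(EndpointName=...)["EndpointStatus"]` (the endpoint name here is whatever the SDK generated for the model).

```python
import time

def wait_for_endpoint(get_status, timeout_s=1800, poll_s=30):
    """Poll a SageMaker endpoint's status until it settles or we time out.

    get_status: zero-argument callable returning the current EndpointStatus
    string, e.g. "Creating", "InService", or "Failed". Injected so the
    polling logic can be exercised without AWS credentials.
    """
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        status = get_status()
        if status in ("InService", "Failed"):
            # Terminal states: either the endpoint came up or creation failed.
            return status
        time.sleep(poll_s)
    return "TimedOut"
```

With real credentials the callable would look something like `lambda: boto3.client("sagemaker").describe_endpoint(EndpointName=name)["EndpointStatus"]`; a "Failed" status at least gives a FailureReason in describe_endpoint instead of an indefinite hang.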
Thanks!