I’m trying to deploy a private Hugging Face model to AWS SageMaker by running the following code locally:
import os

import boto3
import sagemaker
from dotenv import load_dotenv
from huggingface_hub import login
from sagemaker.huggingface import HuggingFaceModel

# Load environment variables from .env
load_dotenv()

# Get the Hugging Face access token and log in
token = os.environ.get("HF_LOGIN", None)
login(token)

hub = {
    "HF_MODEL_ID": "Org/model-name",
    "HF_TASK": "text-classification",
    "HUGGING_FACE_HUB_TOKEN": token,
}

iam = boto3.client("iam")
role = iam.get_role(RoleName="AmazonSageMaker-ExecutionRole")["Role"]["Arn"]
session = sagemaker.Session(boto3.Session(region_name="ap-southeast-2"))

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,                    # IAM role with the necessary permissions
    transformers_version="4.17",  # Transformers version of the inference container
    pytorch_version="1.10",       # PyTorch version of the inference container
    py_version="py38",            # Python version of the inference container
    sagemaker_session=session,
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,     # Number of instances (1 for a small load)
    instance_type="ml.m5.xlarge", # Instance type sized for the model
)
However, the deploy() call just keeps running and never finishes, and I’m not sure why. Any advice would be greatly appreciated.
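For reference, this is roughly how I’ve been checking whether the endpoint ever leaves the Creating state while deploy() blocks. It’s only a sketch: the status lookup is injected as a callable so the polling logic is self-contained, and in practice `get_status` would wrap `boto3.client("sagemaker").describe_endpoint(EndpointName=...)["EndpointStatus"]` (the endpoint name here is whatever the SDK generated for the model).

```python
import time

def wait_for_endpoint(get_status, timeout_s=1800, poll_s=30):
    """Poll a SageMaker endpoint's status until it settles or we time out.

    get_status: zero-argument callable returning the current EndpointStatus
    string, e.g. "Creating", "InService", or "Failed". Injected so the
    polling logic can be exercised without AWS credentials.
    """
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        status = get_status()
        if status in ("InService", "Failed"):
            # Terminal states: either the endpoint came up or creation failed.
            return status
        time.sleep(poll_s)
    return "TimedOut"
```

With real credentials the callable would look something like `lambda: boto3.client("sagemaker").describe_endpoint(EndpointName=name)["EndpointStatus"]`; a "Failed" status at least gives a FailureReason in describe_endpoint instead of an indefinite hang.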
Thanks!