Infer with SageMaker for a Private Model

Hello,

I have a private model that I'd like to use for inference with Amazon SageMaker. The documentation and code snippets work great for public models, but for private models I'm given the exact same snippet, which makes no reference to my authentication token, so when I run it the endpoint SageMaker creates returns a 404 error. Does anyone know what to add to this code to make it work for private models?

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()
# Hub Model configuration. https://huggingface.co/models
hub = {
	'HF_MODEL_ID':'private_model',
	'HF_TASK':'text2text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
	transformers_version='4.6.1',
	pytorch_version='1.7.1',
	py_version='py36',
	env=hub,
	role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
	initial_instance_count=1, # number of instances
	instance_type='ml.m5.xlarge' # ec2 instance type
)

predictor.predict({
	'inputs': "The answer to the universe is"
})

Thanks so much,

Karim

Hey @kmfoda,

If you want to use private models, you also need to define HF_API_TOKEN in the hub dictionary. More here: GitHub - aws/sagemaker-huggingface-inference-toolkit.
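
For example, your hub dictionary would then look something like this (a minimal sketch; reading the token from an environment variable is just one option, you can pass the string however you like):

import os

hub = {
	'HF_MODEL_ID':'private_model',
	'HF_TASK':'text2text-generation',
	# token from huggingface.co/settings/tokens, used by the endpoint
	# to download the private model from the Hub
	'HF_API_TOKEN': os.environ['HF_API_TOKEN']
}

The rest of the deployment code stays the same.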

Perfect. Thank you very much @philschmid.

This was kind of hard to track down and isn't mentioned in any docs. Could you add HF_API_TOKEN to the script that appears when you click "Deploy" → "Amazon SageMaker" → "AWS" for private models? It would've saved me a lot of time.
