Serverless deploy troubles

Hi Alex

The way you instantiate the HuggingFaceModel class looks a bit unusual to me. I usually go about it this way:

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # works inside SageMaker; otherwise pass your IAM role ARN

huggingface_model = HuggingFaceModel(
    model_data="s3://hf-sagemaker-inference/model.tar.gz",  # S3 path to your trained SageMaker model
    role=role,  # IAM role with permissions to create an Endpoint
    transformers_version="4.17",  # Transformers version used
    pytorch_version="1.10",  # PyTorch version used
    py_version="py38",  # Python version of the DLC
)

As for the serverless deployment itself, you seem to be doing everything right, as far as I can tell. Just make sure you use the latest DLC, i.e. specify the latest supported Transformers and PyTorch versions (4.17 and 1.10 respectively, in this case). You can find the latest versions here: Reference
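For reference, here is a minimal sketch of how I would put the serverless pieces together. The helper names (build_model_kwargs, deploy_serverless) are made up for illustration, and the memory size and concurrency values are assumptions you should tune for your model. The SDK imports sit inside the deploy function only so that the kwargs helper can be checked without the SDK or AWS credentials.

```python
def build_model_kwargs(model_data, role):
    """Collect HuggingFaceModel arguments, pinning the DLC versions from this thread."""
    return {
        "model_data": model_data,        # S3 path to model.tar.gz
        "role": role,                    # IAM role allowed to create an Endpoint
        "transformers_version": "4.17",
        "pytorch_version": "1.10",
        "py_version": "py38",
    }


def deploy_serverless(model_data, role, memory_size_in_mb=4096, max_concurrency=10):
    """Deploy the model to a serverless endpoint (hypothetical helper).

    memory_size_in_mb and max_concurrency are assumed example values.
    """
    # Imported lazily so build_model_kwargs works without the SageMaker SDK.
    from sagemaker.huggingface import HuggingFaceModel
    from sagemaker.serverless import ServerlessInferenceConfig

    model = HuggingFaceModel(**build_model_kwargs(model_data, role))
    config = ServerlessInferenceConfig(
        memory_size_in_mb=memory_size_in_mb,
        max_concurrency=max_concurrency,
    )
    # Passing a serverless config replaces instance_type / initial_instance_count.
    return model.deploy(serverless_inference_config=config)
```

Calling deploy_serverless("s3://hf-sagemaker-inference/model.tar.gz", role) should then return a predictor, assuming your role and bucket permissions are set up.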

You can also check out these two sample notebooks and mix and match to fit your use case:

Hope that helps!

Cheers
Heiko