Deploy customized SpaCy NER transformer model to SageMaker

Hi,
I am deploying a customized spaCy NER transformer model to SageMaker and I am getting the following inference error:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "Can\u0027t load config for \u0027/.sagemaker/mms/models/jinhybr__en_SEDNA_NER_MARTIME\u0027. Make sure that:\n\n- \u0027/.sagemaker/mms/models/jinhybr__en_SEDNA_NER_MARTIME\u0027 is a correct model identifier listed on \u0027https://huggingface.co/models\u0027\n  (make sure \u0027/.sagemaker/mms/models/jinhybr__en_SEDNA_NER_MARTIME\u0027 is not a path to a local directory with something else, in that case)\n\n- or \u0027/.sagemaker/mms/models/jinhybr__en_SEDNA_NER_MARTIME\u0027 is the correct path to a directory containing a config.json file\n\n"
}

Here is my deploy code:

from sagemaker.huggingface import HuggingFaceModel
import sagemaker 

role = sagemaker.get_execution_role()


# Hub Model configuration. https://huggingface.co/models
hub = {
  'HF_MODEL_ID':'jinhybr/en_SEDNA_NER_MARTIME', # model_id from hf.co/models
  'HF_TASK':'token-classification' # NLP task you want to use for predictions
}



# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   env=hub,
   role=role, # iam role with permissions to create an Endpoint
   transformers_version="4.12.3", # transformers version used
   pytorch_version="1.9.1", # pytorch version used
   py_version="py38", # python version of the DLC
)
predictor = huggingface_model.deploy(
   initial_instance_count=1,
   instance_type="ml.m5.xlarge"
)
data = {
   "inputs": "Camera - You are awarded a SiPix Digital Camera! call 09061221066 fromm landline. Delivery within 28 days."
}

# request
predictor.predict(data)

I also tried deploying the model as a tarball from S3 and got the same errors. Here is the S3 code:



huggingface_model = HuggingFaceModel(
   model_data="s3://hugg-ner-model/model.tar.gz", # path to your trained SageMaker model
   role=role,                     # IAM role with permissions to create an endpoint
   transformers_version="4.12.3", # Transformers version used
   pytorch_version="1.9.1",       # PyTorch version used
   py_version="py38",             # Python version used
   env={"HF_TASK": "token-classification"},
)
## I built the archive with `tar zcvf model.tar.gz *` from the local model directory and copied it to S3
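For reference, a minimal sketch of my packaging step (`MODEL_DIR` is a stand-in for my local spaCy model directory, and the placeholder files only illustrate the layout). The point is to tar the *contents* of the directory, not the directory itself, so the model files sit at the archive root:

```shell
# Stand-in for the local spaCy model directory, with placeholder files
MODEL_DIR=$(mktemp -d)
touch "$MODEL_DIR/config.cfg" "$MODEL_DIR/meta.json"

# -C archives the directory contents without a wrapping parent folder
tar -C "$MODEL_DIR" -zcvf model.tar.gz .

# Inspect the archive contents
tar -tzf model.tar.gz
```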

Really appreciate your help!

Hey @jinhybr,

The zero-code deployment through the hub configuration currently only works for transformers models and sadly not for spaCy. To use spaCy you would need to provide a custom inference.py and requirements.txt, similar to this example: Creating document embeddings with Hugging Face's Transformers & Amazon SageMaker
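To illustrate, here is a minimal sketch of what such a `code/inference.py` could look like. `model_fn` and `predict_fn` are the inference toolkit's documented override hooks; the output shape below mimics the token-classification format but is my own assumption, and your requirements.txt would need to pin `spacy` (plus the model package itself):

```python
# code/inference.py -- hypothetical sketch, packaged inside model.tar.gz

def model_fn(model_dir):
    """Load the spaCy pipeline from the extracted model.tar.gz."""
    import spacy
    return spacy.load(model_dir)

def predict_fn(data, model):
    """Run NER and return entities in a token-classification-like shape."""
    doc = model(data.get("inputs", ""))
    return [
        {"entity": ent.label_, "word": ent.text,
         "start": ent.start_char, "end": ent.end_char}
        for ent in doc.ents
    ]
```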

Thanks so much! I finally deployed the customized spaCy model with a Docker image behind Lambda and API Gateway. It works!
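For anyone landing here later, a rough sketch of what the Lambda side of that setup could look like (my own assumptions: the model package is baked into the Docker image, the model name matches the hub identifier, and the function sits behind an API Gateway proxy integration):

```python
# lambda_handler.py -- hypothetical sketch for a container-image Lambda

import json

_nlp = None

def _get_model():
    """Load the spaCy pipeline once per container (cold start only)."""
    global _nlp
    if _nlp is None:
        import spacy
        _nlp = spacy.load("en_SEDNA_NER_MARTIME")  # package baked into the image
    return _nlp

def handler(event, context, nlp=None):
    """API Gateway proxy handler: JSON body in, entity list out."""
    nlp = nlp or _get_model()
    body = json.loads(event.get("body") or "{}")
    doc = nlp(body.get("inputs", ""))
    ents = [
        {"entity": e.label_, "word": e.text,
         "start": e.start_char, "end": e.end_char}
        for e in doc.ents
    ]
    return {"statusCode": 200, "body": json.dumps(ents)}
```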