Hi, I'm trying to deploy a serverless endpoint from model_data. I'm doing it the same way I deployed a similar model to an EC2 instance, but it fails.
I do
huggingface_model = HuggingFaceModel(**model_params)
where
model_params = {
    'role': <exec_role>,
    'transformers_version': '4.6',
    'sagemaker_session': <sagemaker.session.Session object at 0x158528e50>,
    'pytorch_version': '1.7',
    'py_version': 'py36',
    'model_data': <path_to_S3>,
}
then
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=memory_size_in_mb, max_concurrency=max_concurrency
)
huggingface_model.deploy(
    serverless_inference_config=serverless_config, endpoint_name=model_name, wait=wait
)
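For context, I then call the endpoint through the predictor that deploy() returns, roughly like this (the payload below is just an illustrative example, not my real input):

# same deploy call as above, just keeping the returned predictor
predictor = huggingface_model.deploy(
    serverless_inference_config=serverless_config, endpoint_name=model_name, wait=wait
)
# illustrative request; my real payload differs
result = predictor.predict({"inputs": "some example text"})
print(result)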
The deployment itself seems to go fine, but when I invoke the endpoint I get:
("You need to define one of the following ['feature-extraction', 'text-classification', 'token-classification', 'question-answering', 'table-question-answering', 'fill-mask', 'summarization', 'translation', 'text2text-generation', 'text-generation', 'zero-shot-classification', 'conversational', 'image-classification'] as env 'TASK'.", 403)
I tried adding env={"HF_TASK": "feature-extraction"} to the model creation, but I then get another error (which makes sense, since I'm not really specifying a model from the hub):
"Can't load config for '/.sagemaker/mms/models/model'. Make sure that:
- '/.sagemaker/mms/models/model' is a correct model identifier listed on 'https://huggingface.co/models'
- or '/.sagemaker/mms/models/model' is the correct path to a directory containing a config.json file"
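For reference, the model creation with the env variable added looked roughly like this (same model_params as above; just a sketch):

from sagemaker.huggingface import HuggingFaceModel

# same model_params dict as above, plus the task hint for the inference toolkit
huggingface_model = HuggingFaceModel(
    env={"HF_TASK": "feature-extraction"},
    **model_params,
)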
Does anyone have an idea that could help?
Thank you,
Alex