Hi HuggingFace community,
I’m attempting to deploy a fine-tuned T5 model for summarization to a SageMaker endpoint. The endpoint deploys successfully with the following code:
from sagemaker.huggingface.model import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    model_data="s3://my-s3-path/model.tar.gz",
    role=role,
    transformers_version="4.6",
    pytorch_version="1.7",
    py_version="py36",
)
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name="my-endpoint-name",
)
I then try to call the endpoint:
from sagemaker.huggingface.model import HuggingFacePredictor

predictor = HuggingFacePredictor(endpoint_name="my-endpoint-name", sagemaker_session=sess)
predictor.predict({
    "inputs": "this is a string",
    "parameters": {
        "max_length": 20,
        "min_length": 1,
    },
})
And I get the following error:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "(\"You need to define one of the following ['feature-extraction', 'text-classification', 'token-classification', 'question-answering', 'table-question-answering', 'fill-mask', 'summarization', 'translation', 'text2text-generation', 'text-generation', 'zero-shot-classification', 'conversational', 'image-classification'] as env 'TASK'.\", 403)"
}
Can anyone tell me where I’m supposed to specify that this is a summarization model? I can’t find anything in the docs that covers this.
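My best guess from the error message is that the task has to be passed to the inference container as an environment variable when the model is created, along the lines of the sketch below. I haven’t confirmed the exact variable name (the error says 'TASK', but I’ve seen 'HF_TASK' mentioned in the inference toolkit README), so please treat this as a guess rather than something I know works:

# Guess only: pass the task to the inference container via an env var.
# Unsure whether the key should be "HF_TASK" or "TASK".
huggingface_model = HuggingFaceModel(
    model_data="s3://my-s3-path/model.tar.gz",
    role=role,
    transformers_version="4.6",
    pytorch_version="1.7",
    py_version="py36",
    env={"HF_TASK": "summarization"},
)

Is that the right approach, or does the task need to be defined somewhere else entirely (e.g. inside model.tar.gz)? Thanks!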