SageMaker serverless endpoint deployment error (image size greater than supported size)

Hello guys,

I got an UnexpectedStatusException when trying to deploy a model to a serverless endpoint.

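from sagemaker.huggingface import HuggingFaceModel
from sagemaker.serverless import ServerlessInferenceConfig

# `role` is the IAM execution role ARN, defined earlier in the session.
huggingface_model = HuggingFaceModel(
    model_data="s3://path/model.tar.gz",
    role=role,
    py_version="py310",
    transformers_version="4.28.1",
    pytorch_version="2.0.0",
)
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,
    max_concurrency=10,
)
predictor = huggingface_model.deploy(serverless_inference_config=serverless_config)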

The error message is:

UnexpectedStatusException: Error hosting endpoint huggingface-pytorch-inference-2023-07-21-03-29-12-789: Failed. Reason: Image size 13884113136 is greater than supported size 10737418240.

It seems the image size (about 12.9 GiB) is larger than the 10 GiB image limit.
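
For reference, the two byte counts in the error can be converted to GiB to see how far over the limit the image is. This is only a minimal sketch using the standard library: the helper name `parse_image_size_error` is made up for illustration, and the regex assumes the failure reason has exactly the wording shown in the message above.

```python
import re

GIB = 1024 ** 3  # bytes per GiB


def parse_image_size_error(message):
    """Return (image_bytes, limit_bytes) from the failure reason, or None.

    Hypothetical helper: assumes the wording of the error quoted in this thread.
    """
    match = re.search(
        r"Image size (\d+) is greater than supported size (\d+)", message
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


reason = (
    "Failed. Reason: Image size 13884113136 is greater than "
    "supported size 10737418240."
)
image_bytes, limit_bytes = parse_image_size_error(reason)
overage_bytes = image_bytes - limit_bytes

print(f"image:   {image_bytes / GIB:.2f} GiB")    # 12.93 GiB
print(f"limit:   {limit_bytes / GIB:.2f} GiB")    # 10.00 GiB
print(f"over by: {overage_bytes / GIB:.2f} GiB")  # 2.93 GiB
```

So the reported image is roughly 2.9 GiB above the supported size.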

Yes. What's your question?

Hi @philschmid,

I just want to know whether this Hugging Face DLC can be deployed to a serverless endpoint.

Also, my model archive uploaded to S3 is 2.2 GB. Does it count toward the image size, and is that what pushed the image over the 10 GB limit?

Thanks.

Yes, see: Serverless Inference with Hugging Face's Transformers, DistilBERT and Amazon SageMaker