How to Create a Model in the SageMaker Console from a .tar.gz

@philschmid @YannAgora, thanks for your help! Here is a short update on the case. It may also be relevant for anyone who runs into the same problem or wants to set up serverless inference via the SageMaker console.

TL;DR

  • The endpoint returns predictions
  • BUT an error is still visible in the logs: python: can't open file '/usr/local/bin/deep_learning_container.py': [Errno 13] Permission denied
  • Rebuilding the endpoint configuration with the maximum memory (6 GB) magically resolved the error
  • AWS Support is aware of the permission issue and is still investigating a fix

I had a call with AWS Support and showed them the issue. We manually deleted the endpoint, the endpoint configuration and the model.

Then we did the following manually in the Amazon SageMaker console:

  1. Create model from .tar.gz file on S3 || Amazon SageMaker → Models → Create model
    1.1 Make sure the assigned IAM role has the AmazonSageMakerFullAccess IAM policy attached.
    1.2 Select “Provide model artifacts and inference image location”
    1.3 Select “Use a single model”
    1.4 Location of inference code image: we used a CPU-only prebuilt Hugging Face AWS image for inference. Replace the region in the URI with your own (we used eu-west-1), like 763104351884.dkr.ecr.eu-west-1.amazonaws.com/huggingface-pytorch-inference:1.9.1-transformers4.12.3-cpu-py38-ubuntu20.04
    1.5 Location of model artifacts: copy the S3 URI of the .tar.gz.
    1.6 We left all other settings untouched and saved the model.

  2. Create endpoint configuration || Amazon SageMaker → Endpoint configuration → Create endpoint configuration
    2.1 Type of endpoint: Serverless
    2.2 Production variants → Click “Add Model”, select the model you created in step 1 and save.
    2.3 Back in the main setup, click “Edit” next to the selected model and assign 6 GB of Memory Size to the model (I also set Max Concurrency to 1, but I am not sure that is needed), then hit save
    2.4 Save the endpoint config by clicking “Create endpoint configuration”

  3. Create endpoint || Amazon SageMaker → Endpoints → Create and configure endpoint
    3.1 Name the endpoint
    3.2 Select “Use an existing endpoint configuration”
    3.3 Select the endpoint configuration and click “Select endpoint configuration”
    3.4 Click “Create endpoint”
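
For anyone who would rather script this than click through the console, steps 1 to 3 can be sketched with boto3 roughly as follows. This is not what we did in the call with AWS Support, only an assumed equivalent of the console steps. The model, config and endpoint names, the variant name "AllTraffic", the role ARN and the S3 URI are placeholders you would replace with your own values.

```python
# Rough boto3 equivalent of console steps 1-3 (a sketch, not a tested deployment script).
IMAGE_TAG = "huggingface-pytorch-inference:1.9.1-transformers4.12.3-cpu-py38-ubuntu20.04"


def image_uri(region):
    """Build the Hugging Face inference image URI for the given region (step 1.4)."""
    return f"763104351884.dkr.ecr.{region}.amazonaws.com/{IMAGE_TAG}"


def model_request(model_name, role_arn, model_data_url, region="eu-west-1"):
    """Step 1: a single model with an inference image and .tar.gz artifacts on S3."""
    return {
        "ModelName": model_name,
        "ExecutionRoleArn": role_arn,  # role with AmazonSageMakerFullAccess attached
        "PrimaryContainer": {
            "Image": image_uri(region),
            "ModelDataUrl": model_data_url,  # S3 URI of the .tar.gz
        },
    }


def endpoint_config_request(config_name, model_name, memory_mb=6144, max_concurrency=1):
    """Step 2: serverless endpoint configuration with 6 GB memory."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # placeholder variant name
                "ModelName": model_name,
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


def endpoint_request(endpoint_name, config_name):
    """Step 3: endpoint that uses an existing endpoint configuration."""
    return {"EndpointName": endpoint_name, "EndpointConfigName": config_name}


def deploy(name, role_arn, model_data_url, region="eu-west-1"):
    """Create model, endpoint configuration and endpoint, then wait until it is in service."""
    import boto3  # imported here so the request builders work without AWS access

    sm = boto3.client("sagemaker", region_name=region)
    sm.create_model(**model_request(name, role_arn, model_data_url, region))
    sm.create_endpoint_config(**endpoint_config_request(f"{name}-config", name))
    sm.create_endpoint(**endpoint_request(f"{name}-endpoint", f"{name}-config"))
    sm.get_waiter("endpoint_in_service").wait(EndpointName=f"{name}-endpoint")
    return f"{name}-endpoint"
```

Calling deploy("my-model", "<your role ARN>", "s3://<your bucket>/model.tar.gz") would need valid AWS credentials; the three request builders can be inspected without them.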

It takes a few minutes to create the endpoint. After that, you can open the endpoint and copy the invocation URL, which looks like https://runtime.sagemaker.eu-west-1.amazonaws.com/endpoints/endpoint-name/invocations. You can then send POST requests to the endpoint. The request body has the format {"inputs":"Your Text"}, and the endpoint returns something like:

[
    {
        "label": "LABEL_27",
        "score": 0.7524298429489136
    }
]
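
One thing to keep in mind: requests to the invocation URL have to be signed with AWS credentials (SigV4), so a plain unauthenticated POST will be rejected. A minimal sketch of calling the endpoint through boto3, which handles the signing, could look like this. The endpoint name passed to invoke() is a placeholder, and the helper names are made up for illustration.

```python
import json


def build_body(text):
    """Serialize the request in the {"inputs": ...} format described above."""
    return json.dumps({"inputs": text})


def top_prediction(response_body):
    """Return (label, score) of the highest-scoring entry in the endpoint response."""
    predictions = json.loads(response_body)
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], best["score"]


def invoke(endpoint_name, text, region="eu-west-1"):
    """Send one request to the endpoint and return its top prediction."""
    import boto3  # imported here so the helpers above work without AWS access

    runtime = boto3.client("sagemaker-runtime", region_name=region)
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_body(text),
    )
    return top_prediction(response["Body"].read())
```

With the sample response above, top_prediction would give back the label LABEL_27 together with its score.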

Hope that helps!
