Cannot invoke SageMaker endpoint, keep getting OSError

Hi,

Could any experts on this topic give me a hand?

I keep getting this error:


```
An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "Can\u0027t load config for \u0027/.sagemaker/mms/models/model\u0027. If you were trying to load it from \u0027https://huggingface.co/models\u0027, make sure you don\u0027t have a local directory with the same name. Otherwise, make sure \u0027/.sagemaker/mms/models/model\u0027 is the correct path to a directory containing a config.json file"
}
```

When I go into the logs, I see:

File "/opt/conda/lib/python3.8/site-packages/transformers/file_utils.py", line 1936, in cached_path raise EnvironmentError(f"file {url_or_filename} not found"), OSError: file /.sagemaker/mms/models/model/config.json not found

I am not using a public Hugging Face model; I use my own model, trained in a SageMaker notebook and stored in an S3 bucket. Deploying the endpoint succeeds; the problem only appears when I invoke the endpoint.
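For context, the invocation itself is nothing unusual; a minimal sketch of what the calling side does (the endpoint name, region, and payload below are placeholders, not my real values):

```
import json

import boto3

# Runtime client for the deployed endpoint (region is a placeholder)
runtime = boto3.client("sagemaker-runtime", region_name="eu-west-1")

# With HF_TASK=token-classification the container expects an "inputs" field
response = runtime.invoke_endpoint(
    EndpointName="my-ner-endpoint",  # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps({"inputs": "Amazon SageMaker is based in Seattle."}),
)
print(json.loads(response["Body"].read()))
```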

I have it working in one dev environment, but when I follow the same setup for the prod environment, the endpoint does not work.

What I have done to debug:

  • Checked all roles and policies (s3:PutObject, s3:GetObject, …). The role I created lets me deploy successfully, so I have no idea what else could be missing.

  • Checked that model.tar.gz, once unpacked, contains config.json etc. (see the sketch after this list).

  • Checked the network.
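For the archive check above, a minimal sketch of what I ran locally (the path is a placeholder); the files have to sit at the root of the archive, not inside a subfolder:

```
import tarfile

archive_path = "model.tar.gz"  # local copy of the archive from S3 (placeholder)

# List the archive members; config.json should appear at the top level,
# e.g. "config.json", not "model/config.json".
with tarfile.open(archive_path, "r:gz") as tar:
    print(tar.getnames())
```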

Background:

I fine-tuned a BERT-based model from Hugging Face and deployed it with the HuggingFaceModel class from sagemaker.huggingface.model and its deploy() function. Everything works well in dev, but I get the issue above in the prod environment. I tested the deployment both with and without a VPC, but it still does not work.

Code I use:

```
from sagemaker.huggingface import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    model_data=model_data,          # path to your trained SageMaker model in S3
    role=role,                      # IAM role with permissions to create an endpoint
    transformers_version="4.17.0",  # Transformers version used
    pytorch_version="1.10.2",       # PyTorch version used
    py_version="py38",              # Python version used
    env={
        "HF_TASK": "token-classification",
        "SAGEMAKER_CONTAINER_LOG_LEVEL": xxx,
        "SAGEMAKER_REGION": "xxx",
    },
    # vpc_config=vpc_config,        # Specify VPC settings
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
)
```
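Invoking it afterwards is just the standard predict call (the sentence is only a placeholder):

```
# Token-classification payload expected by the Hugging Face inference toolkit
result = predictor.predict({"inputs": "My name is Wolfgang and I live in Berlin."})
print(result)
```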

How have you created the model.tar.gz? The error indicates that its structure is wrong. Check the documentation: Deploy models to Amazon SageMaker


Hi,

You’re right. I went through the documentation you mentioned and did exactly what it said. I later figured out that the problem was with how the .tar.gz file was compressed: even though I used the same file for both the dev and prod environments, I had to unpack it, re-pack the same content, and re-upload it to S3. That eventually fixed it.
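For anyone hitting the same thing, this is roughly the repacking step (the folder, archive path, bucket, and key below are placeholders, assuming the local folder holds the extracted model files); the important part is that config.json ends up at the root of the archive:

```
import os
import tarfile

import boto3

model_dir = "model"            # folder with config.json, pytorch_model.bin, ... (placeholder)
archive_path = "model.tar.gz"  # rebuilt archive (placeholder)

# Add the *contents* of the folder, not the folder itself, so that
# config.json sits at the root of the archive.
with tarfile.open(archive_path, "w:gz") as tar:
    for name in os.listdir(model_dir):
        tar.add(os.path.join(model_dir, name), arcname=name)

# Re-upload the archive to S3 (bucket and key are placeholders)
boto3.client("s3").upload_file(archive_path, "my-bucket", "models/model.tar.gz")
```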

Hey @darklee, thanks for mentioning the compression issue, otherwise I would never have figured it out. I fine-tuned a model and was creating the .gz file directly from the folder, and it threw an error at inference time. I converted the folder into a zip file, then converted that into a .gz file, and then it worked.