Multi-Model Endpoint with Hugging Face

Hi Team,

Good day!!

I’m trying to deploy multiple BERT models in one container behind one endpoint using the boto3 API.

After extracting the model.tar.gz file, the inference code and model artifacts are laid out in the structure below.

image

In the CloudWatch logs, I observed that the dependency libraries were not installed when SageMaker set up the container.

Hence I’m getting the below error.

ModuleNotFoundError: No module named 'spacy'

Note: I followed the above structure to deploy a single model endpoint and it works fine.

It looks like we need to follow a different structure for a multi-model endpoint deployment.

Could you please help me with this issue?

Thanks

Can you please provide more information on how you create the zip file and how you are deploying the endpoint? Also, what is the content of your requirements.txt? It looks like you want to use spacy in the inference script, but it is not installed.

As requested, please find the details below.

  1. Creating the zip file: As shown in the image above, the “model” directory contains the model artifacts and the “code” directory contains the inference code and the requirements.txt file.

image

  2. Contents of the requirements.txt file:

spacy==3.2.1
spacy-alignments==0.8.4
spacy-legacy==3.0.8
spacy-loggers==1.0.1
spacy-transformers==1.1.4
tokenizers==0.10.3
pytextspan==0.5.4
validators==0.18.2

  3. Deployment:

image

image

Could you try to create the model.tar.gz following the steps in this notebook? notebooks/sagemaker-notebook.ipynb at main · huggingface/notebooks · GitHub

The tar.add(model_path) call creates the wrong structure, with a nested model/ directory, but the artifacts need to be at the top level of the archive.
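
To illustrate the point about tar.add, here is a minimal sketch of one way to build the archive so that nothing is nested under model/. It is not taken from the notebook: the function names make_model_archive and list_archive are made up for this example, and it assumes the code/ folder (inference script and requirements.txt) sits inside the model directory next to the artifacts, and that the output archive is written outside that directory.

```python
import tarfile
from pathlib import Path


def make_model_archive(model_dir, output_path="model.tar.gz"):
    """Pack the *contents* of model_dir at the top level of a gzipped tar.

    Hypothetical helper for illustration. output_path should point outside
    model_dir, otherwise the archive would try to include itself.
    """
    model_dir = Path(model_dir)
    with tarfile.open(output_path, "w:gz") as tar:
        for item in sorted(model_dir.iterdir()):
            # arcname drops the parent directory, so entries are stored as
            # "config.json" or "code/..." instead of "model/config.json".
            # Directories such as code/ are added recursively.
            tar.add(str(item), arcname=item.name)
    return str(output_path)


def list_archive(archive_path):
    """Return the sorted member names of the archive for a quick check."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

Calling list_archive on the result before uploading is a quick way to confirm that no entry starts with model/ and that code/requirements.txt is present.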