Multi-Model Endpoint with Hugging Face

Hi Team,

Good day!!

I’m trying to deploy multiple BERT models in one container behind one endpoint using the boto3 API.

If we unzip the model.tar.gz file, it contains the inference code and model artifacts in the structure below.

image

In the CloudWatch logs, I observed that the dependency libraries were not installed when SageMaker set up the container.

Hence I’m getting the error below.

ModuleNotFoundError: No module named 'spacy'

Note: I followed the above structure to deploy a single-model endpoint, and it works fine.

It looks like we need to follow a different structure for a multi-model endpoint deployment.

Could you please help me with this issue?

Thanks

Can you please provide more information on how you create the zip file and how you are deploying the endpoint? Also, what is the content of your requirements.txt? It looks like you want to use spacy in the inference script, but it is not installed.

As requested, please find the details below.

  1. Creating the zip file: As shown in the image above, the “model” directory contains the model artifacts and the “code” directory contains the inference code and the requirements.txt file.

image

  2. Contents of the requirements.txt file:

spacy==3.2.1
spacy-alignments==0.8.4
spacy-legacy==3.0.8
spacy-loggers==1.0.1
spacy-transformers==1.1.4
tokenizers==0.10.3
pytextspan==0.5.4
validators==0.18.2

  3. Deployment:

image

image
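The deployment code itself was posted as screenshots, so the following is only a rough sketch of what a multi-model deployment with boto3 usually involves, not the poster's actual code. It builds the keyword arguments for the SageMaker `create_model`, `create_endpoint_config`, `create_endpoint` and runtime `invoke_endpoint` calls. The key points are `"Mode": "MultiModel"` in the container definition, a `ModelDataUrl` that points to an S3 prefix rather than a single archive, and `TargetModel` at invocation time. All names, the instance type and the payload shape are assumptions for illustration.

```python
import json


def build_deploy_requests(name, image_uri, role_arn, s3_model_prefix,
                          instance_type="ml.m5.xlarge"):
    """Build keyword arguments for the three SageMaker calls that create
    a multi-model endpoint. All values passed in are placeholders."""
    return {
        "create_model": {
            "ModelName": name,
            "ExecutionRoleArn": role_arn,
            "Containers": [{
                "Image": image_uri,
                # S3 prefix that holds every model.tar.gz, not one archive.
                "ModelDataUrl": s3_model_prefix,
                "Mode": "MultiModel",
            }],
        },
        "create_endpoint_config": {
            "EndpointConfigName": name,
            "ProductionVariants": [{
                "VariantName": "AllTraffic",
                "ModelName": name,
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
            }],
        },
        "create_endpoint": {
            "EndpointName": name,
            "EndpointConfigName": name,
        },
    }


def deploy(sm_client, requests):
    """Issue the calls in order; sm_client would be boto3.client("sagemaker")."""
    sm_client.create_model(**requests["create_model"])
    sm_client.create_endpoint_config(**requests["create_endpoint_config"])
    sm_client.create_endpoint(**requests["create_endpoint"])


def build_invoke_request(endpoint_name, target_model, payload):
    """Keyword arguments for invoke_endpoint on boto3.client("sagemaker-runtime").
    target_model is the archive's key relative to the S3 prefix."""
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "TargetModel": target_model,
        "Body": json.dumps(payload),
    }
```

With real clients, this would be used as `deploy(boto3.client("sagemaker"), requests)` followed by `boto3.client("sagemaker-runtime").invoke_endpoint(**build_invoke_request(...))`.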

Could you try to create the model.tar.gz following the steps in this notebook? notebooks/sagemaker-notebook.ipynb at main · huggingface/notebooks · GitHub

The tar.add(model_path) call creates the wrong structure, with a nested model/ directory, but the artifacts need to be at the top level of the archive.
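As an illustration of that layout, here is a minimal sketch that packs the model artifacts at the archive root and the inference code under code/ using Python's tarfile module. The function names and the example file names are hypothetical; the referenced notebook may build the archive differently.

```python
import tarfile
from pathlib import Path


def build_model_archive(model_dir, code_dir, archive_path):
    """Pack model artifacts at the archive root and inference code under code/.

    archive_path should be outside model_dir and code_dir.
    """
    model_dir, code_dir = Path(model_dir), Path(code_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # Add each artifact under its bare file name so that no nested
        # model/ directory ends up in the archive.
        for item in sorted(model_dir.iterdir()):
            tar.add(item, arcname=item.name)
        # The inference script and requirements.txt go under code/.
        tar.add(code_dir, arcname="code")
    return archive_path


def list_archive(archive_path):
    """Return the sorted member names, useful for checking the layout."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

Calling `list_archive("model.tar.gz")` afterwards is a quick way to confirm that files such as config.json sit at the top level and that the code/ directory is present.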

Hello @philschmid

The issue was not related to saving the model artifacts at the top level.

Solution: To install additional libraries in the container, the libraries listed in the requirements.txt file need to be installed with pip from within the inference script. Within the archive, the Hugging Face container expects all inference code to be inside the code/ directory.

I did not find any article on multi-model endpoints with Hugging Face, or any examples in the “aws-samples” GitHub repo, so I published this article on Medium.
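A minimal sketch of that workaround, assuming requirements.txt sits next to the inference script in code/. The helper name `install_requirements` and its `runner` parameter are hypothetical and only there to keep the example testable; the Medium article may do this differently.

```python
import subprocess
import sys
from pathlib import Path


def install_requirements(requirements_path, runner=subprocess.check_call):
    """Install the packages listed in requirements.txt using the same
    interpreter that runs the inference script.

    Returns the pip command that was run, or None if the file is missing.
    """
    requirements_path = Path(requirements_path)
    if not requirements_path.is_file():
        return None  # nothing to install
    command = [sys.executable, "-m", "pip", "install",
               "-r", str(requirements_path)]
    runner(command)  # raises CalledProcessError if pip fails
    return command


# In the inference script this would run once at import time, before the
# extra packages are imported, for example:
#   install_requirements(Path(__file__).parent / "requirements.txt")
#   import spacy
```

Installing at import time adds to the time it takes to load the first model, so it is worth checking the CloudWatch logs to confirm that pip completes before the first request is served.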

Hope that helps everyone. Thanks


Vinayaks117 Medium article link: Multi-Model Endpoints with Hugging Face Transformers and Amazon SageMaker | by Vinayak Shanawad | Analytics Vidhya | Medium