Inference Toolkit - custom inference with multiple models

Hello,

I’m trying to perform custom inference, where I need to use a model and a tokenizer hosted in two different repositories on HuggingFace. I have looked at the sample custom inference notebook, but it only uses a single model.

Similarly, the code summarization notebook also loads the tokenizer and the model from the same directory. I want to implement something similar, but with the model and the tokenizer hosted in two different HuggingFace repositories.

If I load a model and a tokenizer from two different HuggingFace repos, package them into a tar.gz file, and push that to S3, what should the directory structure inside the tar.gz be? I have tried the following without success:

model.tar.gz
    /model1
        model_config.json (along with other model files)
        /code
            inference.py
    /tokenizer
        tokenizer_config.json

My custom model loading function looks like:

from transformers import AutoModel, T5Tokenizer

def model_fn(model_dir):
    # Load the model and the tokenizer from their respective subdirectories
    model = AutoModel.from_pretrained(f"{model_dir}/model1")
    tokenizer = T5Tokenizer.from_pretrained(f"{model_dir}/tokenizer")
    return model, tokenizer
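
For context, a predict_fn consuming this tuple would look roughly like the sketch below. It assumes the model has a generation head (e.g. loaded with AutoModelForSeq2SeqLM rather than the plain AutoModel above) and that requests arrive as {"inputs": "..."}; both are assumptions on my part, not tested code:

def predict_fn(data, model_and_tokenizer):
    # model_fn returned a (model, tokenizer) tuple, so unpack it here
    model, tokenizer = model_and_tokenizer
    inputs = tokenizer(data["inputs"], return_tensors="pt")
    # generate() assumes a model with a language-modeling head
    output_ids = model.generate(**inputs, max_length=128)
    return {"generated_text": tokenizer.decode(output_ids[0], skip_special_tokens=True)}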

And my model creation code looks like:

from sagemaker.huggingface.model import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    entry_point='inference.py',
    source_dir='code',
    model_data=s3_location,
    role=role,
    pytorch_version='1.7.1',
    py_version='py36',
    transformers_version='4.6.1',
)
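
For completeness, a typical deploy-and-invoke step for this model would look like the following sketch; the instance type and payload shape are illustrative assumptions, not values from my setup:

# Deploy the model to a real-time endpoint (instance type is an example)
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge',
)

# The payload shape must match what inference.py expects
result = predictor.predict({"inputs": "some input text"})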

The error I’m seeing is:

OSError: file /.sagemaker/mms/models/model/config.json not found.

I’d appreciate some help figuring out whether my directory structure is incorrect, and what it should be. If there is a better way to achieve this, please suggest it.

As per my understanding, your “code” folder should be at the top level of the archive, not nested where it currently is. Also, you need to download a snapshot of the model, place the “code” folder inside it at the top level, add any other files your tokenizer requires, and then tar.gz everything together.
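
To make that concrete, here is a rough sketch of how the archive could be assembled. The repo IDs are placeholders, and snapshot_download from a recent huggingface_hub is just one way to fetch the files; treat this as an illustration rather than a verified recipe:

import shutil
import tarfile
from pathlib import Path

from huggingface_hub import snapshot_download

workdir = Path("model_package")
workdir.mkdir(exist_ok=True)

# Download snapshots of both repos (repo IDs are placeholders)
snapshot_download(repo_id="your-org/your-model", local_dir=workdir / "model1")
snapshot_download(repo_id="your-org/your-tokenizer", local_dir=workdir / "tokenizer")

# Put the inference script in a top-level code/ folder
shutil.copytree("code", workdir / "code")

# Pack everything so model1/, tokenizer/, and code/ sit at the archive root
with tarfile.open("model.tar.gz", "w:gz") as tar:
    for item in workdir.iterdir():
        tar.add(item, arcname=item.name)

With that layout, model1/, tokenizer/, and code/inference.py all sit at the top level of model.tar.gz, which matches the paths that the model_fn above expects.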