Inference Toolkit - custom inference with multiple models

Hello,

I’m trying to perform custom inference, where I need to use a model and a tokenizer hosted in two different repositories on HuggingFace. I have looked at the sample custom inference notebook, but it only uses a single model.

Similarly, the code summarization notebook also loads the tokenizer and the model from the same directory. I want to implement something similar, but with the model and the tokenizer hosted in two different HuggingFace repositories.

If I load a model and a tokenizer from two different HuggingFace repos, package them into a tar.gz file, and push that to S3, what should the directory structure inside the tar.gz be? I have tried the following without success:

model.tar.gz
    /model1
        model_config.json (along with other model files)
        /code
            inference.py
    /tokenizer
        tokenizer_config.json

My custom model loading function looks like:

from transformers import AutoModel, T5Tokenizer

def model_fn(model_dir):
    # Load the model and the tokenizer from their respective subdirectories
    model = AutoModel.from_pretrained(f"{model_dir}/model1")
    tokenizer = T5Tokenizer.from_pretrained(f"{model_dir}/tokenizer")
    return model, tokenizer
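
For context, a predict_fn consuming this tuple would look roughly like the sketch below. It assumes the model has a generation head (e.g. loaded with AutoModelForSeq2SeqLM rather than the plain AutoModel above) and that requests arrive as {"inputs": "..."}; both are assumptions on my part, not tested code:

def predict_fn(data, model_and_tokenizer):
    # model_fn returned a (model, tokenizer) tuple, so unpack it here
    model, tokenizer = model_and_tokenizer
    inputs = tokenizer(data["inputs"], return_tensors="pt")
    # generate() assumes a model with a language-modeling head
    output_ids = model.generate(**inputs, max_length=128)
    return {"generated_text": tokenizer.decode(output_ids[0], skip_special_tokens=True)}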

And my model creation code looks like:

from sagemaker.huggingface.model import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    entry_point='inference.py',
    source_dir='code',
    model_data=s3_location,
    role=role,
    pytorch_version='1.7.1',
    py_version='py36',
    transformers_version='4.6.1',
)
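
For completeness, a typical deploy-and-invoke step for this model would look like the following sketch; the instance type and payload shape are illustrative assumptions, not values from my setup:

# Deploy the model to a real-time endpoint (instance type is an example)
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge',
)

# The payload shape must match what inference.py expects
result = predictor.predict({"inputs": "some input text"})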

The error I’m seeing is:

OSError: file /.sagemaker/mms/models/model/config.json not found.

I’d appreciate some help figuring out whether my directory structure is incorrect, and what it should be. If there is a better way to achieve this, please suggest it.

As per my understanding, your “code” folder should be at the top level of the archive, not nested where it currently is. Also, you need to download a snapshot of the model, place the “code” folder inside it at the top level, add any other files your tokenizer requires, and then tar.gz everything together.
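
To make that concrete, here is a rough sketch of how the archive could be assembled. The repo IDs are placeholders, and snapshot_download from a recent huggingface_hub is just one way to fetch the files; treat this as an illustration rather than a verified recipe:

import shutil
import tarfile
from pathlib import Path

from huggingface_hub import snapshot_download

workdir = Path("model_package")
workdir.mkdir(exist_ok=True)

# Download snapshots of both repos (repo IDs are placeholders)
snapshot_download(repo_id="your-org/your-model", local_dir=workdir / "model1")
snapshot_download(repo_id="your-org/your-tokenizer", local_dir=workdir / "tokenizer")

# Put the inference script in a top-level code/ folder
shutil.copytree("code", workdir / "code")

# Pack everything so model1/, tokenizer/, and code/ sit at the archive root
with tarfile.open("model.tar.gz", "w:gz") as tar:
    for item in workdir.iterdir():
        tar.add(item, arcname=item.name)

With that layout, model1/, tokenizer/, and code/inference.py all sit at the top level of model.tar.gz, which matches the paths that the model_fn above expects.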