Okay, got it! I'll come back to you with a working example/steps.
No, it wouldn't. When providing a custom inference.py, the hub config should be ignored, unless you are not overwriting the model_fn. You need to use model_fn and not load_fn.
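For reference, here is a minimal sketch of what such an inference.py could look like; the summarization pipeline and payload format are assumptions for illustration, but model_fn and predict_fn are the override hooks the SageMaker Hugging Face Inference Toolkit looks for:

# code/inference.py - minimal sketch; the summarization task is an assumption
from transformers import pipeline

def model_fn(model_dir):
    # Called once when the endpoint starts; model_dir is where model.tar.gz was extracted
    return pipeline("summarization", model=model_dir, tokenizer=model_dir)

def predict_fn(data, model):
    # data is the deserialized request body, e.g. {"inputs": "..."}
    return model(data["inputs"])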
Could you test providing the inference.py dynamically when creating the endpoint, instead of packaging it in the model.tar.gz? Like this:
from sagemaker.huggingface import HuggingFaceModel

hf_model = HuggingFaceModel(
    model_data="s3://call-summarization/model1.tar.gz",
    role=role,
    transformers_version="4.6.1",
    pytorch_version="1.7.1",
    py_version="py36",
    entry_point="inference.py",  # uploaded from the local source_dir below
    source_dir="code",
)
For this to work, the file structure needs to be:
code/
    inference.py
deploy.py
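For completeness, a sketch of the deploy step that deploy.py could then run; the instance type and example payload are assumptions, not taken from this thread:

predictor = hf_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",  # assumption: pick an instance that fits your model
)

print(predictor.predict({"inputs": "Transcript of the call to summarize ..."}))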