Hey @ujjirox, thanks for your detailed response. I am trying to recreate it and will provide a working example.
But why do you want to use a custom inference.py? From looking at your code, it seems you are not doing anything special. You should be able to deploy your model by providing HF_TASK: "summarization" as an environment variable and remove the inference.py from your archive, like this:
from sagemaker.huggingface import HuggingFaceModel
import sagemaker
model_name = 'model1'
endpoint_name = 'endpoint1'
# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_TASK': 'summarization'
}
role = sagemaker.get_execution_role()
# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    model_data="s3://call-summarization/model1.tar.gz",
    role=role,
    transformers_version="4.6.1",
    pytorch_version="1.7.1",
    env=hub,
    py_version='py36',
    name=model_name
)
# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type='ml.g4dn.xlarge',
    endpoint_name=endpoint_name,
)