Inference Toolkit - Init and default template for custom inference

Hey @ujjirox, thank you for your detailed response. I am trying to recreate it and provide an example that works.
But why do you want to use a custom inference.py? From looking at your code, it seems you are not doing anything special. You should be able to deploy your model by just providing HF_TASK: "summarization" with it.

Something like this, after removing the inference.py from your archive:

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

model_name = 'model1'
endpoint_name = 'endpoint1'

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_TASK': 'summarization'
}

role = sagemaker.get_execution_role()

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    model_data="s3://call-summarization/model1.tar.gz",  # your model archive on S3
    role=role,
    transformers_version="4.6.1",
    pytorch_version="1.7.1",
    py_version="py36",
    env=hub,
    name=model_name
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type='ml.g4dn.xlarge',
    endpoint_name=endpoint_name,
)