Deploy ONNX model to SageMaker

Hi,

I am wondering if anyone has an example of how to deploy an ONNX-converted model to SageMaker.

What is the procedure to create a model artifact for deployment?

And does anything change when deploying the model to SageMaker?

Thanks

Hey @kamneb,

Sadly there is no example for it yet. I hope I can create one soon.

The process would be similar to this example, except that you would need to upload your ONNX model, create a requirements.txt including onnxruntime/optimum, and write an inference.py.
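For the dependencies, a minimal requirements.txt might look like the following (the exact packages and whether to pin versions are assumptions — the post only mentions onnxruntime/optimum):

```
onnxruntime
optimum
```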

Thanks @philschmid. I just changed the path to the ONNX model in the inference.py file and added the dependency in the requirements.txt file. It works well.


@kamneb can you share your requirements.txt and your inference.py?

I'm a little confused how you load the ONNX runtime, as the Hugging Face container expects a transformers AutoModel object.
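For reference, the Hugging Face inference container lets you bypass the default AutoModel loading by defining `model_fn` and `predict_fn` in inference.py. A minimal sketch of what that could look like for an ONNX model — the `model.onnx` filename, the request/response shapes, and loading the tokenizer from the same model dir are all assumptions, not taken from this thread:

```python
# inference.py — sketch of SageMaker inference-toolkit overrides for ONNX.
import os


def model_fn(model_dir):
    """Called once at container start; return value is handed to predict_fn."""
    # Imports kept inside the hook so the module itself can be loaded
    # without onnxruntime/transformers installed locally.
    import onnxruntime as ort
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    session = ort.InferenceSession(os.path.join(model_dir, "model.onnx"))
    return {"tokenizer": tokenizer, "session": session}


def predict_fn(data, model):
    """Tokenize the request payload and run it through the ONNX session."""
    tokenizer, session = model["tokenizer"], model["session"]
    inputs = tokenizer(data["inputs"], return_tensors="np")
    # Feed only the tensors the exported graph actually declares as inputs.
    input_names = {i.name for i in session.get_inputs()}
    feed = {k: v for k, v in inputs.items() if k in input_names}
    outputs = session.run(None, feed)
    return {"logits": outputs[0].tolist()}
```

With these two hooks present in `code/inference.py` inside the model archive, the container never tries to build an AutoModel itself.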

I made a blog post walking through how to do this for a simple case here. Hopefully that can help streamline it for any folks trying to do this in the future : )

@nbertagnolli Where’s the blog? The ‘here’ link does not take us there.

Oh no sorry! Does this one work? Deploy an ONNX Transformer to Sagemaker.