How to deploy a T5 model to AWS SageMaker for fast inference?

1. Upload the ONNX T5 base model to Amazon S3

You can use, for example, boto3 or the AWS CLI to upload files to S3 (see the sketch below), and you can find documentation on how to create the model.tar.gz here: Deploy models to Amazon SageMaker
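A minimal sketch of the packaging and upload step with boto3, assuming the exported ONNX files sit in a local directory; the directory, bucket, and key names below are placeholders for your own setup:

```python
import tarfile
import boto3

# Placeholder paths and bucket name; adjust to your setup.
MODEL_DIR = "onnx-t5-base"          # local directory with the exported ONNX model files
ARCHIVE = "model.tar.gz"
BUCKET = "my-sagemaker-bucket"      # hypothetical bucket name
S3_KEY = "t5-onnx/model.tar.gz"

# SageMaker expects the files at the root of the archive,
# so add the directory contents rather than the directory itself.
with tarfile.open(ARCHIVE, "w:gz") as tar:
    tar.add(MODEL_DIR, arcname=".")

s3 = boto3.client("s3")
s3.upload_file(ARCHIVE, BUCKET, S3_KEY)
print(f"Uploaded to s3://{BUCKET}/{S3_KEY}")
```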

2. Use the ONNX T5 base model in the SageMaker Hugging Face DLC to run inference

Currently, there is no example for using ONNX in SageMaker with the HF DLC, but you would need to create a custom inference.py as documented here: Deploy models to Amazon SageMaker. Add the ONNX dependencies to a requirements.txt, package everything into the model.tar.gz, and upload it to S3 (a sketch follows below).
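For illustration, here is a minimal inference.py sketch. It assumes the archive was produced with the optimum ONNX export for T5, so that ORTModelForSeq2SeqLM can load it; if your export layout is different, adapt the loading code accordingly.

```python
# code/inference.py -- minimal sketch of a custom handler for the HF DLC.
# Assumes the model.tar.gz contains an optimum-style ONNX export of T5.
from optimum.onnxruntime import ORTModelForSeq2SeqLM
from transformers import AutoTokenizer


def model_fn(model_dir):
    """Called once at endpoint startup: load the tokenizer and the ONNX model."""
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = ORTModelForSeq2SeqLM.from_pretrained(model_dir)
    return {"model": model, "tokenizer": tokenizer}


def predict_fn(data, model_and_tokenizer):
    """Called per request: run seq2seq generation on the ONNX model."""
    model = model_and_tokenizer["model"]
    tokenizer = model_and_tokenizer["tokenizer"]
    inputs = tokenizer(data["inputs"], return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return {"generated_text": tokenizer.batch_decode(output_ids, skip_special_tokens=True)}
```

Your requirements.txt would then list the ONNX dependencies (for this sketch, optimum[onnxruntime]). Put inference.py and requirements.txt in a code/ folder inside the model.tar.gz, next to the model files. Once the archive is on S3, you can point a HuggingFaceModel at it and deploy an endpoint, roughly like this (the S3 URI, IAM role, framework versions, and instance type below are placeholders):

```python
# Hypothetical deployment sketch using the SageMaker Python SDK.
from sagemaker.huggingface import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    model_data="s3://my-sagemaker-bucket/t5-onnx/model.tar.gz",   # assumed S3 URI
    role="arn:aws:iam::111122223333:role/SageMakerRole",          # assumed IAM role
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39",
)
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)
print(predictor.predict({"inputs": "translate English to German: Hello"}))
```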
