Deploying Hugging Face SageMaker Models with Elastic Inference

Hello @jxiao,

You can check out this blog post on how to compile and deploy models to Inferentia: "Accelerate BERT inference with Hugging Face Transformers and AWS Inferentia".
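
In case it helps, here is a rough sketch of what the deploy step could look like with the SageMaker Python SDK. It assumes you already have a Neuron-compiled model packaged as a `model.tar.gz` on S3, together with whatever inference script your model needs. The helper names (`inferentia_model_kwargs`, `deploy_to_inferentia`), the container versions, and the `ml.inf1.xlarge` instance type are my assumptions for illustration, so please check them against the blog post and the containers currently available.

```python
def inferentia_model_kwargs(model_data, role):
    """Collect constructor arguments for sagemaker.huggingface.HuggingFaceModel.

    Hypothetical helper; the version strings below are assumptions and
    must match a Hugging Face container that supports Inferentia.
    """
    return {
        "model_data": model_data,        # S3 URI of the Neuron-compiled model.tar.gz
        "role": role,                    # IAM role ARN with SageMaker permissions
        "transformers_version": "4.12",  # assumed version
        "pytorch_version": "1.9",        # assumed version
        "py_version": "py37",            # assumed version
    }


def deploy_to_inferentia(model_data, role, instance_type="ml.inf1.xlarge"):
    """Create a real-time endpoint on an Inferentia instance (hypothetical helper)."""
    # Imported lazily so the kwargs helper works without the SDK installed.
    from sagemaker.huggingface import HuggingFaceModel

    model = HuggingFaceModel(**inferentia_model_kwargs(model_data, role))
    # deploy() provisions the endpoint and returns a predictor object.
    return model.deploy(initial_instance_count=1, instance_type=instance_type)
```

Calling `deploy_to_inferentia("s3://<your-bucket>/model.tar.gz", "<your-role-arn>")` needs AWS credentials and creates billable resources, so remember to delete the endpoint when you are done.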
