Hello @jxiao,
you can check out this blog post on how to compile and deploy models to Inferentia: "Accelerate BERT inference with Hugging Face Transformers and AWS Inferentia".