How to deploy a T5 model to AWS SageMaker for fast inference?

No, SageMaker is not doing compression of any sort today.

Are you installing transformers==4.15 through a requirements.txt within SageMaker?
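
For context on why I ask: as far as I know, the Hugging Face inference container on SageMaker installs extra dependencies from a `code/requirements.txt` file packed inside `model.tar.gz`. Here is a minimal sketch of how that archive could be built with only the standard library. The helper name `add_requirements` and the directory paths are hypothetical, and the version pin is just the one mentioned above, so adjust it to your setup.

```python
import tarfile
from pathlib import Path


def add_requirements(model_dir: str, archive_path: str, pins: list[str]) -> list[str]:
    """Write code/requirements.txt under model_dir, pack model_dir into a
    gzip-compressed tar archive, and return the archive's member names.

    Hypothetical helper for illustration; archive_path should live outside
    model_dir so the archive does not try to include itself.
    """
    root = Path(model_dir)
    code_dir = root / "code"
    code_dir.mkdir(parents=True, exist_ok=True)

    # One pinned package per line, e.g. "transformers==4.15".
    (code_dir / "requirements.txt").write_text("\n".join(pins) + "\n")

    # Store members relative to the model directory so that the model files
    # and the code/ folder sit at the top level of the archive.
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(root.rglob("*")):
            tar.add(path, arcname=path.relative_to(root).as_posix(), recursive=False)

    # Read the archive back to confirm what was actually packed.
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

You would then upload the resulting `model.tar.gz` to S3 and point your model at it. If the pin is not being picked up, listing the archive members like this is a quick way to check that `code/requirements.txt` is actually where the container expects it.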