Deploying Hugging Face SageMaker Models with Elastic Inference

Vinayaks117 · June 22, 2022, 3:17pm

We are looking for a cost-effective solution and would also like to host multiple models in one container.
However, I don't think we can host multiple models in one container behind a single endpoint with either Elastic Inference or Inferentia; it seems to be possible only with CPU-based instances. Thanks
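
For context, here is a rough sketch of what the CPU-only multi-model setup mentioned above might look like at the request level. It only builds the argument dictionaries for boto3's SageMaker `create_model` call (a container with `Mode` set to `MultiModel`) and the runtime `invoke_endpoint` call (`TargetModel` selects the model archive). The helper names, image URI, bucket, role, and endpoint name are hypothetical placeholders, and whether a given Hugging Face inference image supports multi-model mode is an assumption to verify.

```python
from typing import Any, Dict


def multi_model_container(image_uri: str, model_prefix: str) -> Dict[str, Any]:
    """Build a PrimaryContainer dict for a multi-model SageMaker model.

    Hypothetical helper: model_prefix is an S3 prefix that holds one
    model archive (e.g. model-a.tar.gz) per hosted model.
    """
    if not model_prefix.startswith("s3://"):
        raise ValueError("model_prefix must be an S3 URI")
    return {
        "Image": image_uri,
        # Normalize to exactly one trailing slash so it reads as a prefix.
        "ModelDataUrl": model_prefix.rstrip("/") + "/",
        # "MultiModel" asks SageMaker to load models on demand from the prefix.
        "Mode": "MultiModel",
    }


def invoke_args(endpoint_name: str, target_model: str, payload: bytes) -> Dict[str, Any]:
    """Build keyword arguments for sagemaker-runtime invoke_endpoint."""
    return {
        "EndpointName": endpoint_name,
        # Path of the archive relative to the S3 prefix above.
        "TargetModel": target_model,
        "ContentType": "application/json",
        "Body": payload,
    }


# Usage sketch (needs AWS credentials; all names below are placeholders):
# import boto3
# sm = boto3.client("sagemaker")
# sm.create_model(
#     ModelName="my-multi-model",
#     ExecutionRoleArn="<role-arn>",
#     PrimaryContainer=multi_model_container("<image-uri>", "s3://my-bucket/models"),
# )
# rt = boto3.client("sagemaker-runtime")
# rt.invoke_endpoint(**invoke_args("my-endpoint", "model-a.tar.gz", b'{"inputs": "hi"}'))
```

The endpoint configuration for such a model would use a CPU instance type (for example an `ml.m5` or `ml.c5` family instance) rather than an Elastic Inference accelerator or an Inferentia instance, which matches the limitation described in this post.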
