🤗 LLM Inference Container for SageMaker

mrwadams · June 7, 2023, 1:19pm

Hi all,

I followed the recent blog post on the new LLM Inference Container and was able to use it to deploy Open Assistant 12b.

I’d like to use the same approach to test the other supported models listed in the post, but whenever I tweak the model settings in my code, SageMaker fails to deploy the endpoint.

Does anyone have a tried and tested set of examples for deploying different models using the new Inference Container?

Thanks

mrwadams · June 7, 2023, 2:09pm

I’ve since tried bloomz-560m and bloomz-1b1 and they deployed successfully.
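For anyone comparing settings, here is a minimal sketch of the kind of environment block I'm changing between attempts. The variable names (`HF_MODEL_ID`, `SM_NUM_GPUS`, `MAX_INPUT_LENGTH`, `MAX_TOTAL_TOKENS`) are the ones I understand the container to read. The helper `build_llm_env` and its default values are my own illustration, not part of any SDK, and the length check is just a sanity check I'm assuming the server also enforces.

```python
import json


def build_llm_env(model_id: str, num_gpus: int,
                  max_input_length: int = 1024,
                  max_total_tokens: int = 2048) -> dict:
    """Build the environment dict for the LLM container (hypothetical helper).

    All values are strings because container environment variables are strings.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # Assumption: the input length has to stay below the total token budget.
    if max_input_length >= max_total_tokens:
        raise ValueError("max_input_length must be smaller than max_total_tokens")
    return {
        "HF_MODEL_ID": model_id,                          # model ID on the Hugging Face Hub
        "SM_NUM_GPUS": json.dumps(num_gpus),              # number of GPUs to shard across
        "MAX_INPUT_LENGTH": json.dumps(max_input_length),
        "MAX_TOTAL_TOKENS": json.dumps(max_total_tokens),
    }


# One of the small models that deployed fine, on a single GPU.
env = build_llm_env("bigscience/bloomz-560m", num_gpus=1)
print(env)
```

The resulting dict is what would be passed as the `env` argument when creating the model object in the SageMaker Python SDK; only the model ID and GPU count change between my attempts.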

I’m guessing there’s a point at which the models are just too big to deploy on SageMaker? Would using a larger instance type help in those cases?
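To reason about the size question, here is a rough back-of-the-envelope sketch. It assumes fp16 weights (2 bytes per parameter) and a 20% headroom factor for activations and cache, which is my own guess rather than a measured figure. The GPU memory totals are for the g5 family (24 GiB per A10G GPU); the function names are hypothetical.

```python
# Total GPU memory per instance type, in GiB (g5 instances use 24 GiB A10G GPUs).
GPU_MEMORY_GIB = {
    "ml.g5.2xlarge": 24,    # 1 GPU
    "ml.g5.12xlarge": 96,   # 4 GPUs
    "ml.g5.48xlarge": 192,  # 8 GPUs
}


def fp16_weight_gib(num_params: float) -> float:
    """Approximate memory for the weights alone at 2 bytes per parameter."""
    return num_params * 2 / 1024**3


def fits(num_params: float, instance_type: str, headroom: float = 1.2) -> bool:
    """Rough check: weights plus an assumed headroom factor vs. total GPU memory."""
    return fp16_weight_gib(num_params) * headroom <= GPU_MEMORY_GIB[instance_type]


for name, params in [("bloomz-560m", 560e6), ("12b model", 12e9)]:
    for instance in GPU_MEMORY_GIB:
        print(f"{name:12s} {instance:15s} fits={fits(params, instance)}")
```

By this estimate a 12b model needs roughly 22 GiB for weights alone, so it would not fit on a single 24 GiB GPU once headroom is included, while a multi-GPU instance has room as long as the model is sharded across the GPUs. The estimate ignores quantization and per-GPU fragmentation, so treat it as a first filter only.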
