🤗 LLM Inference Container for SageMaker

Hi all,

I’ve followed the recent blog post on the new Inference Container and used it to deploy Open Assistant 12b.

I’d like to use the same approach to test the other supported models listed in the post, but whenever I tweak the model settings in my code, SageMaker fails to deploy the endpoint.

Does anyone have a tried and tested set of examples for deploying different models using the new Inference Container?
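For reference, the part I’m varying per model looks roughly like this (the model ID, GPU count, and token limits below are illustrative placeholders, not an exact copy of the blog post; the actual `HuggingFaceModel`/`deploy` calls need AWS credentials and a role, so they’re only sketched in comments):

```python
import json

# Environment for the LLM Inference Container -- this is the block I swap
# out per model. Values here are examples, not known-good settings.
def build_llm_env(model_id, num_gpus=1):
    return {
        "HF_MODEL_ID": model_id,                # e.g. "bigscience/bloomz-1b1"
        "SM_NUM_GPUS": json.dumps(num_gpus),    # tensor-parallel degree
        "MAX_INPUT_LENGTH": json.dumps(1024),
        "MAX_TOTAL_TOKENS": json.dumps(2048),
    }

env = build_llm_env("OpenAssistant/oasst-sft-1-pythia-12b", num_gpus=4)

# Sketch of the deployment itself (requires an AWS session and role):
# from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri
# image_uri = get_huggingface_llm_image_uri("huggingface")
# model = HuggingFaceModel(role=role, image_uri=image_uri, env=env)
# llm = model.deploy(initial_instance_count=1,
#                    instance_type="ml.g5.12xlarge",
#                    container_startup_health_check_timeout=600)
```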


I’ve since tried bloomz-560m and bloomz-1b1, and both deployed successfully.

I’m guessing there’s a point at which a model is simply too big for the instance it’s deployed on? Would using a larger instance type help in those cases?
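My rough back-of-envelope math on why the small bloomz variants work and the bigger models don’t (my own estimate, not from the blog post):

```python
# fp16 weights take ~2 bytes per parameter, so a model needs roughly
# 2 GB of GPU memory per billion parameters for the weights alone,
# before the KV cache and activations.
def fp16_weight_gb(n_params_billion):
    return n_params_billion * 2.0

small = fp16_weight_gb(1.1)   # bloomz-1b1: ~2.2 GB, fits easily on one GPU
big = fp16_weight_gb(12.0)    # a 12B model: ~24 GB just for weights
```

If that’s right, a 12B model roughly saturates a single 24 GB GPU, which would explain why sharding across the 4 GPUs of a larger instance (e.g. ml.g5.12xlarge) is needed for headroom.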