Using S3 as model cache for Huggingface LLM inference DLC on Sagemaker


Is it possible to use the Hugging Face LLM inference container for SageMaker (Introducing the Hugging Face LLM Inference Container for Amazon SageMaker) in a way that lets me specify the path to an S3 bucket where I already have the models downloaded and ready for use, instead of downloading them from the internet? Essentially, using the S3 path as an HF_HUB cache, or using the S3 path to download the models onto the local container.

This is useful in cases:

  • where we can’t connect to the internet
  • where we have fine-tuned models stored on S3

Thank you!

We released a blog post on how to do this: Securely deploy LLMs inside VPCs with Hugging Face and Amazon SageMaker
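In short, the pattern described there can be sketched as: package the weights and tokenizer as a `model.tar.gz` in S3, pass that as `model_data`, and set `HF_MODEL_ID=/opt/ml/model` so the container loads from the local copy SageMaker extracts instead of pulling from the Hub. A minimal sketch with the SageMaker Python SDK — the bucket path, container version, and instance type below are placeholders you would replace with your own:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # or an explicit IAM role ARN

# Resolve the URI of the Hugging Face LLM inference (TGI) container
image_uri = get_huggingface_llm_image_uri("huggingface")

model = HuggingFaceModel(
    role=role,
    image_uri=image_uri,
    # model.tar.gz with the model weights/tokenizer at its root (placeholder path)
    model_data="s3://my-bucket/my-model/model.tar.gz",
    env={
        # Point the container at the local directory where SageMaker
        # extracts model_data, so nothing is fetched from the internet.
        "HF_MODEL_ID": "/opt/ml/model",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # placeholder instance type
    container_startup_health_check_timeout=600,
)
```

Because the model is loaded from `/opt/ml/model`, this also works for endpoints in a VPC without internet access, and for fine-tuned models you have only in S3.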