SageMaker Inference for Model Tuned Elsewhere

charlesatftl · August 30, 2021, 5:28pm

I was reading through the Hugging Face & SageMaker documentation as we evaluate it and found the following:

Q: Which models can I deploy for Inference?

A: You can deploy

any Transformers model trained in Amazon SageMaker, or other compatible platforms and that can accommodate the SageMaker Hosting design

any of the 10 000+ publicly available Transformer models from the Hugging Face Model Hub, or

your private models hosted in your Hugging Face premium account!

Is it possible to fine-tune a model elsewhere, outside of SageMaker Training, (for instance, just through a regular PyTorch training loop on a pretrained transformers model), and then deploy it for Inference without hosting it on an account?

Would appreciate any pointers y’all can give on this.

philschmid · August 30, 2021, 5:42pm

Hello @charlesatftl,

Yes, you can fine-tune a Transformers model anywhere you want and use it in SageMaker for Inference.

There are currently two options to use your model then:

1. Push your model to the Hugging Face Model Hub and deploy it directly from there, see: Deploy models to Amazon SageMaker
2. Create a model.tar.gz, upload it to S3, and deploy it from there, see: Deploy models to Amazon SageMaker

In addition to option 2, here is the documentation on how to create a model.tar.gz: Deploy models to Amazon SageMaker
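For the S3 option, a deployment could look roughly like the sketch below. It assumes the `sagemaker` Python SDK is installed and that AWS credentials are configured. The S3 URI and IAM role in the usage comment are placeholders, and the version strings are assumptions taken from the containers available around the time of this thread, so pick a combination your SDK version actually supports.

```python
def build_model_args(model_data, role):
    """Collect the arguments for HuggingFaceModel.

    The version values are assumptions; replace them with a combination
    supported by your installed SageMaker SDK.
    """
    return {
        "model_data": model_data,        # S3 URI of your model.tar.gz
        "role": role,                    # IAM role with SageMaker permissions
        "transformers_version": "4.6",   # assumed container version
        "pytorch_version": "1.7",        # assumed container version
        "py_version": "py36",            # assumed container version
    }


def deploy_from_s3(model_data, role, instance_type="ml.m5.xlarge"):
    """Create a real-time endpoint from a model.tar.gz stored in S3."""
    # Imported here so the helper above can be used without the SDK installed.
    from sagemaker.huggingface import HuggingFaceModel

    model = HuggingFaceModel(**build_model_args(model_data, role))
    return model.deploy(initial_instance_count=1, instance_type=instance_type)


# Hypothetical usage (placeholder bucket and role, creates billable resources):
# predictor = deploy_from_s3(
#     "s3://my-bucket/my-model/model.tar.gz",
#     "arn:aws:iam::111122223333:role/my-sagemaker-role",
# )
# predictor.predict({"inputs": "I love using SageMaker for inference."})
```

The model never has to be pushed to a Hugging Face account in this path; the endpoint loads the weights straight from the archive in S3.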

charlesatftl · August 30, 2021, 6:16pm

Thank you! This is very helpful.

charlesatftl · September 2, 2021, 3:06pm

Now running into issues when I attempt to use a model as you suggested with a modified model.tar.gz. Specifically, when I add a code directory with inference.py and requirements.txt, it starts throwing errors even if those are both empty files, saying that it cannot find the config. The same model with the code directory removed works properly. Error is along the lines of “Can’t load config for … Make sure that: … is a correct model identifier listed on Models - Hugging Face… or … is the correct path to a directory containing a config.json file.”

charlesatftl · September 2, 2021, 3:32pm

Update: I believe this was an error on my end. You’ve gotta be careful to tar the model with the right flags or else the archive won’t have the right structure (i.e. it’ll have the files nested in some relative path).
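In case it helps anyone else, here is a minimal sketch of one way to avoid the nesting problem by building the archive from Python instead of the tar command line. The function names and the `my_model` directory are hypothetical; the point is that `arcname` stores each entry relative to the archive root, so `config.json` and the `code/` directory end up at the top level.

```python
import tarfile
from pathlib import Path


def make_model_archive(model_dir, archive_path):
    """Pack the contents of model_dir so the files sit at the archive root.

    Write archive_path somewhere outside model_dir, otherwise the archive
    would try to include itself.
    """
    model_dir = Path(model_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        for item in sorted(model_dir.iterdir()):
            # arcname drops the parent path, so config.json is stored as
            # "config.json" and not "some/relative/path/config.json".
            # Directories such as code/ are added recursively.
            tar.add(item, arcname=item.name)
    return archive_path


def list_entries(archive_path):
    """Return every member name so the layout can be checked before upload."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


# Hypothetical usage:
# make_model_archive("my_model", "model.tar.gz")
# assert "config.json" in list_entries("model.tar.gz")
```

Listing the entries before uploading is a cheap check: if `config.json` shows up with any directory prefix, the endpoint will likely fail with the "Can't load config" error described above.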
