SageMaker Inference for Model Tuned Elsewhere

I've been reading through the Hugging Face and SageMaker documentation as we evaluate it, and found the following:

Q: Which models can I deploy for Inference?

A: You can deploy

  • any 🤗 Transformers model trained in Amazon SageMaker, or on other compatible platforms, and that can accommodate the SageMaker Hosting design
  • any of the 10 000+ publicly available Transformer models from the Hugging Face Model Hub, or
  • your private models hosted in your Hugging Face premium account!

Is it possible to fine-tune a model elsewhere, outside of SageMaker Training, (for instance, just through a regular PyTorch training loop on a pretrained transformers model), and then deploy it for Inference without hosting it on an account?

Would appreciate any pointers y’all can give on this.

Hello @charlesatftl,

Yes, you can fine-tune a transformers model anywhere you want and use it in SageMaker for Inference.

There are currently two options to use your model then:

  1. Push your model to the Hugging Face Hub (Models - Hugging Face) and deploy it directly from there, see: Deploy models to Amazon SageMaker
  2. Create a model.tar.gz, upload it to S3, and deploy it from there, see: Deploy models to Amazon SageMaker
    In addition to 2., here is the documentation on how to create a model.tar.gz: Deploy models to Amazon SageMaker

Thank you! This is very helpful.

Now running into issues when I attempt to use a model as you suggested with a modified model.tar.gz. Specifically, when I add a code directory with inference.py and requirements.txt, it starts throwing errors even if those are both empty files, saying that it cannot find the config. The same model with the code directory removed works properly. Error is along the lines of “Can’t load config for … Make sure that: … is a correct model identifier listed on Models - Hugging Face… or … is the correct path to a directory containing a config.json file.”

Update: I believe this was an error on my end. You’ve gotta be careful to tar the model with the right flags or else the archive won’t have the right structure (i.e. it’ll have the files nested in some relative path). 🙂
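
For anyone who hits the same thing, here is a minimal sketch of building the archive from Python instead of the tar command line, so that config.json and the code directory end up at the root of model.tar.gz rather than nested under a relative path. The function name build_model_archive and the paths in the demo are my own placeholders, not anything from the SageMaker or Hugging Face docs, and it assumes the archive is written outside the model directory.

```python
import tarfile
import tempfile
from pathlib import Path


def build_model_archive(model_dir, archive_path):
    """Pack the files under model_dir at the ROOT of a gzip-compressed tar.

    build_model_archive is a hypothetical helper name. archive_path is
    assumed to live outside model_dir, otherwise a stale archive would be
    packed into the new one. Returns the member names for inspection.
    """
    root = Path(model_dir)
    # Collect the files before the archive is opened for writing.
    files = sorted(p for p in root.rglob("*") if p.is_file())
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            # arcname is relative to model_dir, so config.json is stored as
            # "config.json" and not as "some/relative/path/config.json".
            tar.add(path, arcname=path.relative_to(root).as_posix())
    # Re-open the archive and list what was actually written.
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    # Demo with empty placeholder files standing in for a real model.
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = Path(tmp) / "my_model"  # hypothetical directory name
        (model_dir / "code").mkdir(parents=True)
        (model_dir / "config.json").write_text("{}")
        (model_dir / "code" / "inference.py").write_text("")
        (model_dir / "code" / "requirements.txt").write_text("")
        print(build_model_archive(model_dir, Path(tmp) / "model.tar.gz"))
```

If the printed names start with a directory prefix other than code/, the archive has the nested layout that seems to have caused the "Can’t load config" error above.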