Productionizing HuggingFace Transformers?

Hi there! I think the two things it depends on most are:

  1. Your company’s existing stack
  2. Your use-case (expected load, real-time vs. batched, etc.)

Can you share a bit more about what those look like in your situation? Without that, it's (IMO) difficult to give concrete recommendations. For example, I could point you towards our Inference API service (Inference API - Hugging Face), which lets you offload inference to our infrastructure. Or you could self-host, taking an approach like the one outlined here: How to Deploy NLP Models in Production - neptune.ai. Some companies even set up full CI/CD pipelines when they need to continuously monitor, retrain, and redeploy their models (Continuous Delivery for Machine Learning).
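
To make those options a bit more concrete, here are two minimal sketches. First, calling the hosted Inference API over HTTP; the model id is just an example and `hf_xxx` is a placeholder for your own API token:

```python
import requests

# Any hosted model id works here; this sentiment model is just an example.
API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english"
headers = {"Authorization": "Bearer hf_xxx"}  # placeholder: your HF API token

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()

print(query({"inputs": "We just shipped transformers to production!"}))
```

And here's a bare-bones version of the self-hosted route, wrapping a `pipeline` in a small FastAPI app (FastAPI is assumed here purely for illustration; the neptune.ai article covers the surrounding concerns like batching, monitoring, and scaling):

```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()

# Load the model once at startup rather than per request.
classifier = pipeline("sentiment-analysis")

class PredictRequest(BaseModel):
    text: str

@app.post("/predict")
def predict(req: PredictRequest):
    # Returns e.g. {"label": "POSITIVE", "score": 0.99}
    return classifier(req.text)[0]
```

You'd run that with something like `uvicorn app:app` behind whatever load balancer your stack already uses, but again, the right choice really depends on the two points above.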

If you can share more about your use-case, I'm definitely happy to give more specific recommendations!
