How to deploy a model on a custom server?

arvind12 · February 20, 2024, 8:56am

Hi! I have fine-tuned a wav2vec2 model on custom data for ASR. How can I deploy it on my own GPU server? What are the possible ways to set up our own server? Cloud hosting is very costly and I cannot afford it. I want to deploy the model on my own GPU and give my customers an API for using it. How can I scale it to 1000 users?
If I deploy the model on my own server, do I need to create 1000 instances of the same model so that 1000 customers can use it simultaneously?

nielsr · February 20, 2024, 8:56pm

Hi,

Usually people use Kubernetes in production, which scales Docker containers automatically based on the load.

This means that you would first need to wrap your model in an API and package that API in a Docker container. The API itself is typically written using Flask or FastAPI.

Next, the Docker container could be automatically scaled using Kubernetes. Personally I don’t know whether it’s feasible to run Kubernetes locally, but I assume you can.
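To make the scaling step more concrete, the sketch below builds a Kubernetes Deployment and a HorizontalPodAutoscaler as plain Python dictionaries and prints them as JSON, which `kubectl apply` accepts. The app name, image reference, port, replica counts and CPU threshold are all hypothetical placeholders. It also assumes the cluster runs the metrics server (needed for CPU-based autoscaling) and the NVIDIA device plugin (needed for the `nvidia.com/gpu` resource).

```python
import json

# Hypothetical values; replace with your own names and image.
APP_NAME = "asr-api"
IMAGE = "registry.example.com/asr-api:latest"
PORT = 8000


def build_manifests(min_replicas=1, max_replicas=4, cpu_target=70):
    """Return a Kubernetes List holding a Deployment and its autoscaler."""
    labels = {"app": APP_NAME}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": APP_NAME},
        "spec": {
            "replicas": min_replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": APP_NAME,
                            "image": IMAGE,
                            "ports": [{"containerPort": PORT}],
                            "resources": {
                                # A CPU request is required for CPU-utilization autoscaling.
                                "requests": {"cpu": "1"},
                                # One GPU per pod; assumes the NVIDIA device plugin.
                                "limits": {"nvidia.com/gpu": 1},
                            },
                        }
                    ]
                },
            },
        },
    }
    autoscaler = {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": APP_NAME},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": APP_NAME,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # Add pods when average CPU utilization exceeds the target percentage.
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_target,
                        },
                    },
                }
            ],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, autoscaler]}


if __name__ == "__main__":
    print(json.dumps(build_manifests(), indent=2))
```

Saved as, say, `manifests.py`, the output could be applied with `python manifests.py | kubectl apply -f -`. Since each pod here requests a full GPU, `max_replicas` is effectively bounded by the number of GPUs in the cluster.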
