Sorry if I missed this in the documentation.
Are user requests processed in parallel or sequentially?
Many thanks in advance for your time.
In short, it depends: whether user requests are handled in parallel or sequentially is determined by the hardware resources available and by the web server implementation the Space uses.
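As an illustration of that difference (this is a generic sketch, not the actual Spaces internals), the same set of requests can be served one at a time by a single worker or concurrently by a pool of workers; the server's worker configuration is what decides which happens:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(req_id: int) -> str:
    # Simulate I/O-bound work, e.g. waiting on a model call.
    time.sleep(0.01)
    return f"response-{req_id}"

requests = list(range(8))

# Sequential: a single worker serves one request at a time.
sequential = [handle_request(r) for r in requests]

# Parallel: a pool of 4 workers keeps several requests in flight at once.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(handle_request, requests))

# Both modes return the same responses; only throughput differs.
assert sequential == parallel
```

The responses are identical either way; the parallel version simply overlaps the waiting time, which is why throughput scales with hardware and server configuration.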
Hi @radames
thank you for the speedy response - this is super useful. Wow, what an amazing service! Out of interest, why do you call them demos when it appears you can scale to many thousands of users?
hi @harrycoppock, for production inference we offer a more robust, autoscaling, dedicated infrastructure managed by us.
Hi @radames – Have Spaces replicas been discontinued? I’m a pro user and I don’t have access to this. Did you mean enterprise clients? Also, how does this work with persistent storage? Do replicas share a common volume? All the best!
hi @akgunomerfaruk, please contact api-enterprise@huggingface.co for individual scaling requests; persistent storage is shared across replicas. Thanks for your interest.
Thank you @radames for the explanations. One more dumb question from me: will the price increase linearly with the number of GPUs that are added? For example, if I need one more GPU because I get too many (predefined) requests, will it just add the price of that GPU, with no extra cost?
sorry, @coralexbadea, could you also please reach out to api-enterprise@huggingface.co? Thanks