Bart and Hugging Face Inference Endpoint working synchronously - can you help me?

WeFanz · June 29, 2024, 10:52am

Hi everyone, I’m currently using facebook/bart-large-mnli to categorize a phrase passed as input.
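
For context, a zero-shot classification request to this model generally carries the phrase plus a list of candidate labels. The sketch below only builds that request body; the helper name `build_zero_shot_payload`, the example phrase, and the labels are illustrative assumptions, not the poster's actual code.

```python
import json


def build_zero_shot_payload(text, candidate_labels, multi_label=False):
    """Build the JSON body for a zero-shot classification request.

    `inputs` holds the phrase to categorize; `candidate_labels` holds the
    categories the model scores the phrase against.
    """
    return {
        "inputs": text,
        "parameters": {
            "candidate_labels": list(candidate_labels),
            "multi_label": multi_label,
        },
    }


# Hypothetical phrase and labels, for illustration only.
body = json.dumps(
    build_zero_shot_payload(
        "I want a refund for my ticket",
        ["billing", "support", "sales"],
    )
)
```

The resulting `body` string is what would be sent in a POST request to the endpoint URL, along with an authorization header.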

I’ve realized that the Hugging Face Inference Endpoint handles requests synchronously, which is a big problem for what I want to do, given that the infrastructure needs to support hundreds of users.

Do you know any alternative which supports concurrent requests? Thanks!
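
One way to check whether requests really are processed one at a time is to send several from the client at once and compare the total time with the time for a single request. The sketch below is a minimal example using a thread pool; `classify_many` and `fake_send` are hypothetical names, and `fake_send` is a stand-in that only sleeps, so it says nothing about how a real endpoint behaves.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def classify_many(texts, send, max_workers=8):
    """Call `send` on every text concurrently, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send, texts))


def fake_send(text):
    """Stand-in for an HTTP call to the endpoint (assumption: ~50 ms latency)."""
    time.sleep(0.05)
    return {"sequence": text, "labels": ["placeholder"]}


start = time.perf_counter()
results = classify_many([f"phrase {i}" for i in range(8)], fake_send)
elapsed = time.perf_counter() - start
print(f"{len(results)} results in {elapsed:.2f}s")
```

Replacing `fake_send` with a function that actually posts to the endpoint would show the difference: if the total time grows roughly linearly with the number of requests, the server is handling them serially, and client-side concurrency alone will not help.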

WeFanz · July 1, 2024, 9:18am

Any update?
