Internal server error when making multiple POST requests to HuggingFace API endpoint for embedding model sentence-transformers/all-MiniLM-L6-v2

I am making multiple consecutive POST requests to the endpoint https://api-inference.huggingface.co/pipeline/feature-extraction/sentence-transformers/all-MiniLM-L6-v2 to embed multiple chunks of text. The first 4-5 requests succeed, but the next request hangs for a long time and eventually comes back with an "Internal server error" response.
The request body (a Python dict, sent as JSON) is:

{"inputs": texts, "options":{"wait_for_model":True}}

Why is this happening?