Summary
Intermittent 504 errors from the Hugging Face Inference API when generating embeddings with the model mixedbread-ai/mxbai-embed-large-v1. The same request sometimes succeeds and sometimes fails when repeated within seconds.
Product
Hugging Face Inference API
Model
mixedbread-ai/mxbai-embed-large-v1
Impact
Embedding generation in a backend script is unreliable. Retries help only sporadically.
Environment
- Client: Node.js script calling HF Inference API
- OS: macOS on developer machine
- Auth: HF API key in Authorization header
- Payload: short English sentence ("This is a test sentence for embedding generation.")
Minimal repro
- Use the HF Inference API for embeddings with the model above.
- Send multiple requests in sequence.
- Observe alternating success and 504 responses.
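The steps above could be scripted roughly as follows. This is a minimal sketch, not the original script: the endpoint URL, the `{ inputs: ... }` payload shape, and the function name `runRepro` are assumptions, and it relies on the global `fetch` available in Node.js 18 or later.

```javascript
// Hypothetical repro sketch; the endpoint URL and payload shape are assumptions.
const MODEL = "mixedbread-ai/mxbai-embed-large-v1";
// Assumed endpoint layout; substitute the URL the failing script actually calls.
const API_URL = `https://api-inference.huggingface.co/models/${MODEL}`;
const SENTENCE = "This is a test sentence for embedding generation.";

// Sends `count` identical requests one after another and records each outcome.
// `fetchFn` is injectable so the loop can be exercised without network access.
async function runRepro(count, apiKey, fetchFn = fetch) {
  const results = [];
  for (let i = 0; i < count; i++) {
    const started = Date.now();
    const res = await fetchFn(API_URL, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ inputs: SENTENCE }),
    });
    // Read the body so the connection is released before the next request.
    const body = await res.text();
    results.push({
      attempt: i + 1,
      status: res.status,
      ms: Date.now() - started,
      bytes: body.length,
    });
  }
  return results;
}
```

For example, `runRepro(10, process.env.HF_API_KEY).then(console.table)` would print one row per attempt (the environment variable name is illustrative). Recording the elapsed time per attempt may help show whether the 504s arrive after a fixed gateway timeout or almost immediately.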
Expected
Consistent 200 with embedding vector of length 1024.
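A small check like the one below could confirm the expected shape on successful responses. It is a sketch under assumptions: the helper name `isExpectedEmbedding` is hypothetical, and it accepts both a flat vector and a single-row nested array because the report does not state which of the two the API returns.

```javascript
const EXPECTED_DIM = 1024; // vector length stated in the report

// Returns true when `payload` is a flat numeric vector of the expected length,
// or a single-row nested array wrapping such a vector (both shapes are assumed).
function isExpectedEmbedding(payload, dim = EXPECTED_DIM) {
  if (!Array.isArray(payload)) return false;
  const vector =
    payload.length === 1 && Array.isArray(payload[0]) ? payload[0] : payload;
  return (
    vector.length === dim &&
    vector.every((v) => typeof v === "number" && Number.isFinite(v))
  );
}
```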
Actual
Roughly alternating success and failure. Failures return 504 with an HTML body labeled “Hugging Face - The AI community building the future.” and title “504 Gateway Timeout.”
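To capture failures like this in logs, the response can be summarized before any JSON parsing is attempted. The sketch below is illustrative only: `describeFailure` is a hypothetical helper, and it assumes the error page carries its title in a standard HTML `<title>` element.

```javascript
// Summarizes a response so gateway-level HTML errors can be told apart
// from JSON responses produced by the API itself.
function describeFailure(status, contentType, bodyText) {
  const isHtml =
    /text\/html/i.test(contentType || "") ||
    /^\s*<(!doctype|html)/i.test(bodyText);
  const titleMatch = /<title[^>]*>([^<]*)<\/title>/i.exec(bodyText);
  return {
    status,
    isHtml,
    title: titleMatch ? titleMatch[1].trim() : null,
    isGatewayTimeout: status === 504,
  };
}
```

Calling it with `res.status`, `res.headers.get("content-type")`, and the body text yields a compact record that can be attached to a report instead of the full HTML page.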