sentence-transformers/all-MiniLM-L6-v2 Not working all of a sudden

POST requests to https://router.huggingface.co/hf-inference/models/sentence-transformers/all-MiniLM-L6-v2 return error code 422 (Unprocessable Content) all of a sudden.
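For reference, here is a minimal sketch of the kind of request that started failing. The payload follows the standard sentence-similarity task format; the token is a placeholder and the sentences are just examples:

import requests

API_URL = "https://router.huggingface.co/hf-inference/models/sentence-transformers/all-MiniLM-L6-v2"
headers = {"Authorization": "Bearer hf_xxx"}  # replace with your own token

# Sentence-similarity payload: one source sentence compared against candidates
payload = {
    "inputs": {
        "source_sentence": "That is a happy person",
        "sentences": ["That is a happy dog", "Today is a sunny day"],
    }
}

response = requests.post(API_URL, headers=headers, json=payload)
print(response.status_code)  # was returning 422 (Unprocessable Content)
print(response.json())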

Yeah, I’m having the same issue using the API to send requests from Unity. Even using the inference provider directly from the model page (sentence-transformers/all-MiniLM-L6-v2 · Hugging Face) results in the same error. It also seems to affect a lot of the other sentence-transformers models.

I’m getting 404 errors using InferenceClient() on meta-llama/Llama-3.3-70B-Instruct, meta-llama/Llama-3.1-8B-Instruct, Mixtral-8x7B-Instruct-v0.1, and mistralai/Mistral-7B-Instruct-v0.3. Basically any InferenceClient() call fails. Surely I’m not alone in this?
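For anyone trying to reproduce, a minimal sketch of the kind of call that was failing (the model name and prompt are just examples; the token is a placeholder):

from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_xxx")  # replace with your own token

# Calls like this were returning 404 at the time of this thread
response = client.chat_completion(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=64,
)
print(response.choices[0].message.content)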

Same here… @michellehbn

For those stuck with HF, get a free Mistral account and grab an API key. Then you can use the class below for text generation. The chat_stream() method simulates the packets HF’s InferenceClient() would return, so it can be a plug-and-play replacement…


import config  # local module holding MISTRALAI_APIKEY; adjust to your own setup
from mistralai import Mistral


# Minimal stand-ins for the chunk objects HF's InferenceClient() yields,
# so downstream code that reads choices[0].delta.content keeps working.
class TextPacket:
    def __init__(self):
        self.choices = []

class TextMessage:
    def __init__(self):
        self.role: str = None
        self.content: str = None

class TextGroup:
    def __init__(self):
        self.index = 0
        self.finish_reason: str = None
        self.delta: TextMessage = TextMessage()
        self.message: TextMessage = TextMessage()

class MistralGenerator:
    def __init__(self):
        self.api_key = config.MISTRALAI_APIKEY
        self.model = "mistral-small-latest"
        self.client = Mistral(api_key=self.api_key)

    def chat_complete(self, query, max_tokens=512, temperature=0.7, top_p=0.9):
        # One-shot completion: send a single user message and return the text.
        chat_response = self.client.chat.complete(
            model=self.model,
            messages=[
                {
                    "role": "user",
                    "content": query,
                },
            ],
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
        )
        print(chat_response.choices[0].message.content)
        return chat_response.choices[0].message.content

    def chat_stream(self, messages, max_tokens=512, temperature=0.7, top_p=0.9):
        # chat.stream() already streams; no separate stream=True flag is needed.
        stream_response = self.client.chat.stream(
            model=self.model,
            messages=messages,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
        )

        for chunk in stream_response:
            # Wrap each Mistral chunk in the InferenceClient-style packet shape.
            message = TextPacket()
            group = TextGroup()
            group.index = 0
            group.delta.role = "assistant"
            group.delta.content = chunk.data.choices[0].delta.content
            message.choices.append(group)
            yield message

        # Final stop packet: finish_reason lives on the choice, not the delta.
        message = TextPacket()
        group = TextGroup()
        group.index = 0
        group.delta.role = "assistant"
        group.delta.content = ""
        group.finish_reason = "stop"
        message.choices.append(group)
        yield message
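
As a usage sketch, downstream code that previously iterated over InferenceClient’s stream can consume this generator unchanged (the message list here is just an example):

gen = MistralGenerator()
messages = [{"role": "user", "content": "Write a haiku about GPUs."}]
for packet in gen.chat_stream(messages):
    choice = packet.choices[0]
    if choice.finish_reason == "stop":
        break
    if choice.delta.content:
        print(choice.delta.content, end="", flush=True)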

Perhaps resolved?

Maybe fixed. From HF Discord:

Tom Aarsen
I’ve asked internally, and they indeed reported an issue, but it has been resolved now! Apologies

Yes, the issue with the sentence transformers has been fixed now.
