API error for model sentence-transformers/all-MiniLM-L6-v2

Hello,

Starting today, my app has become unusable because every call to the Hugging Face API returns the same error.

The model I’m using is sentence-transformers/all-MiniLM-L6-v2.

The error I receive for every API call is this:

An error occurred while fetching the blob

My abbreviated backend code looks like this:

import { HfInference } from "@huggingface/inference";

const {
  text,
  history,
  walletAddress,
  signature,
  message,
  token
} = req.body;

const getEmbedding = async (text) => {
  try {
    const hf = new HfInference(process.env.HF_API_TOKEN);
    return await hf.featureExtraction({
      model: "sentence-transformers/all-MiniLM-L6-v2",
      inputs: text
    });
  } catch (error) {
    throw new Error("Embedding failed: " + (error.message || "Unknown"));
  }
};

Please let me know what I need to do to resolve this error. It worked perfectly until today, and I have made no changes on my end.

I’m now also seeing a 504 error from the same model’s API call in another module, where I create embeddings to upsert content into my vector database, so my upsert process has also come to a halt.
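
For anyone whose upsert jobs are stalling on intermittent 504s, a retry wrapper with exponential backoff may help ride out short outages. This is only a minimal sketch: `backoffDelayMs`, `sleep`, and `withRetry` are hypothetical helper names, not part of `@huggingface/inference`, and the attempt count and delays are arbitrary assumptions. It will not help if the API stays down.

```javascript
// Hypothetical helpers (not part of @huggingface/inference).

// Delay before retrying after failed attempt `attempt` (0-based):
// baseMs * 2^attempt, capped at maxMs.
const backoffDelayMs = (attempt, baseMs = 500, maxMs = 8000) =>
  Math.min(maxMs, baseMs * 2 ** attempt);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `fn` until it resolves or `maxAttempts` calls have failed,
// then rethrows the last error.
const withRetry = async (
  fn,
  { maxAttempts = 4, baseMs = 500, maxMs = 8000 } = {}
) => {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt, baseMs, maxMs));
      }
    }
  }
  throw lastError;
};
```

With the `getEmbedding` function from the first post, usage would look something like `const embedding = await withRetry(() => getEmbedding(text));`.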

Could it be due to this change?

That change appears to be from May 14. I only started using the model in July and had no problems until today, so it doesn’t seem to be the cause.

Yeah, same here. I started using this model in August with the correct Inference API URL, but the problem started today. I’m receiving a 504 Gateway Timeout from the API after a two-minute delay.
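
If waiting two minutes for a 504 is blocking your app, one option is to fail fast on the client side with a timeout. The sketch below uses a hypothetical `withTimeout` helper built on `Promise.race`; the name and the timeout value are assumptions. Note that it only stops waiting and does not cancel the underlying HTTP request.

```javascript
// Hypothetical helper: rejects if `promise` has not settled within `ms` milliseconds.
// The wrapped operation keeps running in the background; only the wait is abandoned.
const withTimeout = (promise, ms) => {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
};
```

For example, `await withTimeout(getEmbedding(text), 10000)` would give up after ten seconds instead of two minutes, using the `getEmbedding` function from the first post.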

Hmm, it seems to be working with the Python client…?

from huggingface_hub import InferenceClient

HF_TOKEN = "hf_***my_read_token***"

client = InferenceClient(
    provider="hf-inference",
    api_key=HF_TOKEN,
)

result = client.sentence_similarity(
    "That is a happy person",
    other_sentences=[
        "That is a happy dog",
        "That is a very happy person",
        "Today is a sunny day"
    ],
    model="sentence-transformers/all-MiniLM-L6-v2",
)
print(result) # [0.6945773363113403, 0.9429150223731995, 0.25687623023986816]

It seems to have been fixed by HF staff.

It started working again briefly on my end, but now all requests are failing again.

Seems to be the same here.
