Inference API stopped working

Hi everyone,

I seem to be encountering a sudden issue with the Hugging Face Inference API. I was successfully using it for image-to-image tasks (specifically with models like stabilityai/stable-diffusion-xl-refiner-1.0), but it abruptly stopped working a short while ago.

Now, regardless of which model I try to use, I consistently receive the following error pattern:

Model <model_name> is not supported HF inference api

This is happening in two distinct ways:

  1. Via InferenceClient: My application code, which was previously working, now gets this error for any model inference request.
  2. Via Website Inference Widget: Even trying to use the Inference Widget directly on the model pages on the Hugging Face website results in the same error message appearing where the widget interface should be.

Here are a couple of specific examples I’ve tested:

Since this is affecting multiple models and occurs both through the API client and the website widget, it feels like it might be a broader issue rather than something specific to my setup or a single model.
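To be concrete, here is roughly the kind of call that was working for me and has now started failing (the image path, prompt, and token below are placeholders rather than my exact code):

from huggingface_hub import InferenceClient

# Serverless Inference API client; the token is a placeholder
client = InferenceClient(api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxx")

# image_to_image takes an input image (path, bytes, or PIL image) plus a prompt
# and normally returns a PIL.Image.Image; it now raises the error quoted above
result = client.image_to_image(
    "input.png",
    prompt="a refined, high-detail version of the input photo",
    model="stabilityai/stable-diffusion-xl-refiner-1.0",
)
result.save("output.png")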

Is anyone else experiencing similar behavior? Any information or confirmation would be greatly appreciated!

Thanks!

3 Likes

We are seeing 404 errors on these two:
{"error":"Model nvidia/cycle-diffusion does not exist"} Status: 404
{"error":"Model yisol/IDM-VTON inference is not supported HF inference api"} Status: 404

2 Likes

Probably the same thing happened.

2 Likes

Hi there.
The issue persists across all models, even in multi-model Spaces like:
40 Models - a Hugging Face Space by Uthar

Every request fails with:
‘Could not complete request to HuggingFace API, Status Code: 404, Error: Model (…) inference is not supported by HF inference API’.

:confounded_face:

1 Like

I wonder if this will even get fixed, or if the model creators will need to set up the new Inference API themselves.

2 Likes

It’s probably both. A migration is underway that will eventually only allow so-called Warm models to be used, so outside of the famous models it will ultimately be a battle for the remaining slots.

3 Likes

Do you know if there’s any fix, or will I have to move away from the Inference API?

1 Like

"For three days now, practically no models are working…

But don’t worry, because on
https://status.huggingface.co/

All services are online

Well, when it actually goes offline…"

:scream::scream:

3 Likes

So how do we use it on the model pages and through the interface as before?

1 Like

Do you know if there’s any fix,

So how do we use it on the model pages and through the interface as before?

idk…

How can they just make a breaking change like this and then not comment on it?

2 Likes

Hi all! Has anyone seen any official information from the HF team on this issue? Please share a link if so.

1 Like

Has anyone seen any official information from the HF team on this issue?

Maybe here.

Hello there. When I first noticed this happening a few days ago, I decided to check the models page to verify which models were still working and which gave errors despite being “deployed”, and I noticed that the biggest models are still working fine with zero issues. If you look at the models page at this very moment and set the filters to text-to-image and the HF Inference provider, you will see that a grand total of 6 models are still deployed and functional, compared to the over 25,000 models that were “deployed” a few days ago. Either these models are just built differently and are unaffected, or HF staff has them segregated on their own servers, kept separate and safe from whatever inference issues crop up. If it is the latter, then either the companies that maintain the base models are shelling out to HF staff, HF staff has already pushed its updated inference API for these models, or this is a semi-blatant case of cherry-picking which models Hugging Face will support, which is anti-user at best and blatantly greedy at worst.

Also, @John6666, if you use your Diffusion Crafter Spaces, the ones with ZeroGPU, you will notice that your models, and any models loaded by that Space, all generate images successfully, while they obviously fail for serverless generation.

If this keeps up, I may have to just use https://www.mage.space/ for the foreseeable future. (I am a long-time subscriber to their pro membership, which at $15 a month is a little pricey, but it lets me enjoy over 200 text-to-image models, some of which are exclusive to mage.space, with unlimited generations, as well as a text-to-video and image-to-video model (the Hunyuan model, currently a beta feature). There is a much pricier $30-per-month subscription whose only real benefit is the ability to upload custom models from your PC or from HF or Civitai. The only issue with the website is the overzealous, highly inconsistent safety checker, which can trigger for one generation but allow another to complete perfectly fine.)
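If anyone wants to check this programmatically rather than through the website filters, here is a rough sketch using huggingface_hub (assuming list_deployed_models and get_model_status are still available in your installed version; the model name is only an example):

from huggingface_hub import InferenceClient

client = InferenceClient()

# list_deployed_models returns a dict mapping task name -> list of model ids
# currently deployed on the serverless HF Inference backend
deployed = client.list_deployed_models()
print(len(deployed.get("text-to-image", [])), "text-to-image models deployed")
print(len(deployed.get("image-to-image", [])), "image-to-image models deployed")

# get_model_status reports whether a specific model is currently loadable/loaded
status = client.get_model_status("stabilityai/stable-diffusion-xl-base-1.0")
print(status.state, status.loaded)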

5 Likes

Even for models that are still served, I keep getting a ‘Model is loading (503)’ error; 50% of my requests now fail.
I tried joining the Discord, but the verification is broken. Does anyone know of any alternative service providers?
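For what it’s worth, wrapping the call in a retry with backoff might help when the 503s are intermittent. A rough sketch, assuming the error surfaces as an HfHubHTTPError (the model, prompt, and token are placeholders):

import time

from huggingface_hub import InferenceClient
from huggingface_hub.utils import HfHubHTTPError

client = InferenceClient(api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxx")

# Retry a few times while the model is loading (503), with a simple linear backoff
image = None
for attempt in range(5):
    try:
        image = client.text_to_image(
            "a watercolor landscape",
            model="black-forest-labs/FLUX.1-schnell",
        )
        break
    except HfHubHTTPError as err:
        if err.response is not None and err.response.status_code == 503:
            time.sleep(10 * (attempt + 1))
        else:
            raise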

1 Like

I tried joining the Discord, but the verification is broken.

For this matter, it would be quicker to contact lunarflu on Discord. If I learn anything about inference, I will post it here as well.

I just checked the same thing, but for image-to-image models (my personal use case), and there are 0 models available. At this point this is making HF unusable, which really sucks after having spent a lot of time developing with the Inference API in mind.

4 Likes

Been almost a week now and no word from HF…

1 Like

Mine stopped working for a while too, getting a 503:
“503 Server Error: Service Temporarily Unavailable for url: https://router.huggingface.co/hf-inference/models/black-forest-labs/FLUX.1-schnell”

I saw that you can set a provider, so I changed the parameters to the following and it worked.

from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="fal-ai",
    api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxx",
)

Which you can find if you click view code on the model card.
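For completeness, here is roughly how the call itself looks with that client (text_to_image returns a PIL image; the prompt and output filename are placeholders, and this assumes fal-ai actually serves FLUX.1-schnell):

# Reusing the client configured above; text_to_image returns a PIL.Image.Image
image = client.text_to_image(
    "an astronaut riding a horse on the moon",
    model="black-forest-labs/FLUX.1-schnell",
)
image.save("flux-schnell-test.png")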

It looks like the router code may not be functioning properly. I can only speculate, but that seems like a potential issue.

1 Like

yeah, let’s talk alternate service providers

Obviously the staff here couldn’t care less about its users.
I’m out of here,
just point me in a direction.
Tired of this *
ignoring us.

There’s only one solution at this point.

(HA! Someone mentioned Uthar/40Models. Ha, that’s me. Well, yntec actually; I only slightly modified it because I figured it needed expanding. Sorry, I’m new to seeing my (forked) code flying around. I had never shared my code until I arrived here; that batch was the first I’ve shared in 35 years, minus CodePen.)

Anyhow,
who cares, right?
What’s important? NEW PROVIDERS.

Mage, you say?

2 Likes