Subject: Hosted Inference returning 404 for multiple models (need assistance)

Hi Hugging Face Support,

I can access Hub metadata (whoami and model_info succeed), but hosted inference calls from my environment return 404 for multiple models.

Details:

  • HF username: Hirtheesh
  • Environment: Windows, venv at C:\study\echoverse\venv
  • huggingface-hub version: (my local version)
  • Models tested and results:
    • google/flan-t5-large → 404 Not Found (x-request-id: Root=1-68ca4f88-087b50c61af9d0812349d41b)
    • sshleifer/tiny-gpt2 → 404 Not Found (tested just now)
  • Token: I verified my token is valid (whoami works). The token is set in the process environment for these tests.
  • Raw request diagnostics: a POST to the model's hosted inference URL returns 404 with headers including x-inference-provider: hf-inference and X-Cache: Error from cloudfront (see the sketch below for how I'm capturing these).
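
For reference, here is roughly how I am reproducing the failure (a minimal sketch; the api-inference.huggingface.co URL and the HF_TOKEN variable name are my assumptions about the classic hosted endpoint and how my token is set):

```python
import os
import requests

# Assumed classic hosted inference endpoint; model id is one of the failing ones
MODEL = "google/flan-t5-large"
URL = f"https://api-inference.huggingface.co/models/{MODEL}"
headers = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}  # token from my environment

resp = requests.post(URL, headers=headers, json={"inputs": "Hello"})
print(resp.status_code)  # 404 in my case

# Headers worth forwarding to support
for key in ("x-request-id", "x-inference-provider", "X-Cache"):
    print(key, "->", resp.headers.get(key))
```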

Could you confirm whether hosted inference is enabled for my account/region, and whether these models are available for hosted inference? If you need additional request IDs or headers, let me know what to capture and I'll provide them.

Thanks,
Hirtheesh

The Inference API has been revamped into Inference Providers, and the set of deployed models has changed significantly. You can check which models are currently deployed on the Hub; flan-t5-large does not appear to be deployed and is therefore unavailable.
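
If you want to check a specific model's deployment status programmatically, something like this should work (a minimal sketch, assuming a huggingface_hub version recent enough that model_info supports the expand parameter):

```python
from huggingface_hub import model_info

# "inference" is an expandable property reporting deployment status
info = model_info("google/flan-t5-large", expand=["inference"])
print(info.inference)  # e.g. "warm" when deployed, "cold" or None otherwise
```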

What model can I use instead of it?

It depends on the use case and your budget. The free tier only covers up to $0.01 worth of inference per month…
There seem to be several T5 models available.
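
For example, to search for T5-family models that the hf-inference provider currently serves (a sketch, assuming a huggingface_hub version where list_models accepts the inference_provider filter):

```python
from huggingface_hub import HfApi

api = HfApi()
# T5-family models currently served by the hf-inference provider
for m in api.list_models(search="t5", inference_provider="hf-inference", limit=10):
    print(m.id)
```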