Example Inference API (model & code), please

Hello. I am an absolute beginner just getting started with HF and the Inference API.
I am trying to connect to a model and ask a basic question. I have created a read access token and am using the following code to connect to gpt2, but the response comes back with a 404 status code.

import requests

API_TOKEN = "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"
MODEL = "openai-community/gpt2"  # tried a few others as well
API_URL = f"https://api-inference.huggingface.co/models/{MODEL}"
headers = {"Authorization": f"Bearer {API_TOKEN}"}
prompt = "Tell me about Paris."

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response

print(query({"inputs": prompt}).status_code)  # prints 404

Can someone please help?


Since gpt2 does not appear to be deployed at this time, I will explain using a deployed model. Samples for Python, JavaScript, and curl can be obtained from the links below.

Deployed models (by HF Inference API)

Deployed models (by all Inference Providers)
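
For example, here is a minimal Python sketch using huggingface_hub's InferenceClient (the model name is only an example and assumes it is currently deployed by a provider; replace it with any model listed at the links above, and hf_xxx with your own token):

from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_xxx")  # your read access token

response = client.chat_completion(
    model="meta-llama/Llama-3.2-1B-Instruct",  # example; pick any currently deployed chat model
    messages=[{"role": "user", "content": "Tell me about Paris."}],
    max_tokens=100,
)
print(response.choices[0].message.content)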

Thanks John :pray:. I was able to run the chat example.

Once I click Deploy, how does one know whether it has actually been deployed? Is there a status such as ‘deployed’?
Is it possible to ‘undeploy’ a model?

Underneath ‘Deploy’ there is a graph of the number of downloads. Is that the number of deployments, or the number of times the model has actually been downloaded?


Oh. Now that you mention it, that explanation is confusing. :sweat_smile:
The deployment status and the Deploy button above are not related.

The Deploy button just provides samples showing how users can deploy the model themselves, or helps with deploying it to the relevant services.

The deployment status indicates whether the model is currently deployed by an Inference Provider. It may change if Hugging Face staff deploy it or in response to an ‘Ask for provider support’ request, but it doesn’t change often. Probably.
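
If you want to check from code rather than from the model page, here is a rough sketch (assuming a recent huggingface_hub version; as far as I know the expanded inference field reports "warm" when the model is currently deployed and "cold" otherwise):

from huggingface_hub import model_info

# Ask the Hub to include the inference status in the response.
info = model_info("openai-community/gpt2", expand=["inference"])
print(info.inference)  # e.g. "warm" (deployed) or "cold" (not deployed)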

Thanks again.
When I get ‘you have exceeded monthly limits …’, is this message coming from HF or from the inference providers?
Also, how can I find out whether a free tier is available from an inference provider, if that question makes sense.


When I get ‘you have exceeded monthly limits …’, is this message coming from HF or from the inference providers?

From HF. As I understand it, the Inference Providers system is a mechanism where HF collects the requests and manages the billing.
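
As a sketch of what that looks like in practice (assuming a huggingface_hub version that supports the provider argument; the provider and model names are just illustrations): you authenticate with your HF token even when a third-party provider serves the request, and the usage is counted against your HF quota.

from huggingface_hub import InferenceClient

# The request is routed through Hugging Face: you pass your HF token,
# HF forwards the call to the chosen provider and bills it to your HF account.
client = InferenceClient(provider="together", token="hf_xxx")
out = client.chat_completion(
    model="meta-llama/Llama-3.2-1B-Instruct",  # illustrative model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(out.choices[0].message.content)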

Also, how can I find out whether a free tier is available from an inference provider, if that question makes sense.

It’s like this: with a free account, the free tier is equivalent to about $0.10 of usage per month.