Example Inference API (model & code), please

Hello. I am an absolute beginner just getting started with HF and the Inference API.
I am trying to connect to a model and ask a basic question. I have created a read access token and am using the following code to connect to gpt2, but the response comes back with a 404 status code.

import requests

API_TOKEN = "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"
MODEL = "openai-community/gpt2"  # tried a few others as well
API_URL = f"https://api-inference.huggingface.co/models/{MODEL}"
headers = {"Authorization": f"Bearer {API_TOKEN}"}
prompt = "Tell me about Paris."

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response

print(query({"inputs": prompt}).status_code)  # prints 404

Can someone please help?


Since gpt2 does not appear to be deployed at this time, I will explain using a deployed model. Samples for Python, JavaScript, and curl can be obtained from the links below.

Deployed models (by HF Inference API)

Deployed models (by all Inference Providers)
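
For example, here is a minimal Python sketch using huggingface_hub's InferenceClient (the model name is only an example and assumes it is currently deployed by a provider; replace it with any model listed at the links above, and hf_xxx with your own token):

from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_xxx")  # your read access token

response = client.chat_completion(
    model="meta-llama/Llama-3.2-1B-Instruct",  # example; pick any currently deployed chat model
    messages=[{"role": "user", "content": "Tell me about Paris."}],
    max_tokens=100,
)
print(response.choices[0].message.content)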

Thanks John :pray:. I was able to run the chat example.

Once I click Deploy, how does one know whether it has actually been deployed? Is there a status such as ‘deployed’?
Is it possible to ‘undeploy’ a model?

Underneath ‘Deploy’ there is a graph of the number of downloads. Is that the number of deployments, or the number of times the model has actually been downloaded?


Oh. Now that you mention it, that explanation is confusing. :sweat_smile:
The deployment status and the Deploy button above are not related.

The Deploy button just provides samples showing how users can deploy the model themselves, or helps with deploying it to the relevant services.

The deployment status indicates whether the model is currently deployed by an Inference Provider. It may change if Hugging Face staff deploy it or in response to an ‘Ask for provider support’ request, but it doesn’t change often. Probably.
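
If you want to check from code rather than from the model page, here is a rough sketch (assuming a recent huggingface_hub version; as far as I know the expanded inference field reports "warm" when the model is currently deployed and "cold" otherwise):

from huggingface_hub import model_info

# Ask the Hub to include the inference status in the response.
info = model_info("openai-community/gpt2", expand=["inference"])
print(info.inference)  # e.g. "warm" (deployed) or "cold" (not deployed)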

Thanks again.
When I get ‘you have exceeded monthly limits …’, is this message coming from HF or from the inference providers?
Also, how can I find out whether a free tier is available from an inference provider, if that question makes sense.


When I get ‘you have exceeded monthly limits …’, is this message coming from HF or from the inference providers?

From HF. As I understand it, the Inference Providers system is a mechanism where HF collects the requests and manages the billing.
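
As a sketch of what that looks like in practice (assuming a huggingface_hub version that supports the provider argument; the provider and model names are just illustrations): you authenticate with your HF token even when a third-party provider serves the request, and the usage is counted against your HF quota.

from huggingface_hub import InferenceClient

# The request is routed through Hugging Face: you pass your HF token,
# HF forwards the call to the chosen provider and bills it to your HF account.
client = InferenceClient(provider="together", token="hf_xxx")
out = client.chat_completion(
    model="meta-llama/Llama-3.2-1B-Instruct",  # illustrative model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(out.choices[0].message.content)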

Also, how can I find out whether a free tier is available from an inference provider, if that question makes sense.

It’s like this: with a free account, the free tier is equivalent to about $0.10 of usage per month.