I want to demonstrate my model by enabling the widget on the model card.
However, it says “This model does not have enough activity to be deployed to Inference API (serverless) yet.”
After I purchased the " Inference Endpoints (dedicated)," and succeed set up the inference API, the widget on the model card is still disabled.
Are there any way to enable the example widget on the model card?
Thanks
The widget practically doesn’t work anymore…
The sad part is that even though I paid for the dedicated inference server, I am still unable to restore the inference widget.
Yeah… even if you pay for it, it won’t come back…
@meganariley @not-lain This is a Serverless Inference API issue, but it seems to be a payment issue as well.
Am I supposed to use an HF Space instead? I’m trying to figure out how to create a Gradio app and make it work on the Space. Unfortunately, the demo code doesn’t work with the paid Inference Endpoint.
That would certainly be the quickest way to do it.
It’s almost joke software, but it’s a simple demo to assist in creating a demo Space.
It can be used for more than just T2I.
If the model you are going to use is a well-known model, someone else may have already published a high-performance space.
Hi @fzmnm The Inference API is a free service for the community to test models directly in the browser or via HTTP requests. It uses shared resources and is not available for every model. For production or latency/availability sensitive use cases, we recommend using Inference Endpoints instead, which will allow you to easily deploy your models on dedicated, fully-managed infrastructure. Inference Endpoints gives you the flexibility to quickly create endpoints on CPU or GPU resources, and is billed by compute uptime vs character usage. Further pricing information can be found here.
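For example, once an endpoint is up and running, you can query it over plain HTTP. The sketch below assumes a placeholder endpoint URL and token, and a simple text-generation style payload:

```python
# Hedged sketch: querying a dedicated Inference Endpoint over HTTP.
# ENDPOINT_URL and HF_TOKEN are placeholders for your own endpoint URL
# (shown on the endpoint's page) and a Hugging Face access token.
import requests

ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"  # placeholder
HF_TOKEN = "hf_..."  # placeholder token

headers = {"Authorization": f"Bearer {HF_TOKEN}"}
payload = {"inputs": "Once upon a time"}

response = requests.post(ENDPOINT_URL, headers=headers, json=payload, timeout=30)
response.raise_for_status()
print(response.json())
```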
Please let me know if you have additional questions.
Thank you very much for your resources.
My use case is that I want to share my homebrew small language model with my friends. It is a very small language model (92M parameters), so it can run on a potato computer.
Currently, I have successfully hosted it on the virtual machine dedicated to the Space. The solution is very similar to the Gradio app generator you provided. I use the free instance tier; since it is a potato model, the free tier can handle it.
However, I’m still unable to connect the Inference Endpoint in Gradio. I also tried developing it locally; maybe I should check the Endpoints forum for a good tutorial. But right now the vCPUs of the Space are enough for my potato model.
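For reference, the app.py in my Space is roughly the sketch below; the model id is a placeholder for my own repo, and the free CPU tier handles a ~92M-parameter model fine:

```python
# Rough sketch of a Gradio Space that runs the small model directly on the
# Space's CPU. The model id is a placeholder for my own repository.
import gradio as gr
from transformers import pipeline

generator = pipeline("text-generation", model="my-username/my-92m-model")  # placeholder id

def generate(prompt, max_new_tokens):
    out = generator(prompt, max_new_tokens=int(max_new_tokens), do_sample=True)
    return out[0]["generated_text"]

demo = gr.Interface(
    fn=generate,
    inputs=[gr.Textbox(label="Prompt"), gr.Slider(1, 256, value=64, label="Max new tokens")],
    outputs=gr.Textbox(label="Output"),
)

demo.launch()
```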
best,
Dear Ariley:
I have two questions:
- can I wire the Inference Endpoint I purchased to the Inference API, so people can test my model on the model card page?
- If I’m encouraged to use Spaces, how do I wire the Inference Endpoint into the Gradio app? The example HF provided does not work with the Inference Endpoint I purchased; roughly what I’m trying is sketched below.
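To make the second question concrete, here is roughly what I’m attempting; the endpoint URL and token are placeholders for my own, and I may well be misusing the client:

```python
# Hedged sketch: pointing a Gradio app at a dedicated Inference Endpoint
# instead of the serverless API. The URL and token below are placeholders.
import gradio as gr
from huggingface_hub import InferenceClient

ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"  # placeholder
client = InferenceClient(model=ENDPOINT_URL, token="hf_...")  # placeholder token

def generate(prompt):
    # Sends the text-generation request to the endpoint URL given above.
    return client.text_generation(prompt, max_new_tokens=64)

gr.Interface(fn=generate, inputs="text", outputs="text").launch()
```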
Best.
However, I’m still unable to connect the Inference Endpoint in Gradio
If the Space were set to Private, it would be virtually impossible to use the Gradio Space from the outside. It might be possible by scraping with a virtual browser, but that would be too much of a hassle.
With the Public setting it should normally be accessible, including from local code, but it would be faster to search this forum or GitHub for know-how; there are many small rules that you only find out by searching.
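For example, if the Space is Public, something like the sketch below using gradio_client should work from outside; the Space id and api_name are placeholders and depend on how the app is defined:

```python
# Hedged sketch: calling a Public Gradio Space from outside with gradio_client.
# The Space id is a placeholder; /predict is the default endpoint name for a
# single gr.Interface, but the name and arguments depend on the actual app.
from gradio_client import Client

client = Client("your-username/your-space")  # placeholder Space id
result = client.predict("Once upon a time", api_name="/predict")
print(result)
```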
You might also ask meganariley.