I want to demonstrate my model by enabling the widget on the model card.
However, it says “This model does not have enough activity to be deployed to Inference API (serverless) yet.”
After I purchased the " Inference Endpoints (dedicated)," and succeed set up the inference API, the widget on the model card is still disabled.
Are there any way to enable the example widget on the model card?
Thanks
The widget practically doesn’t work anymore…
The sad part is that even though I paid for the dedicated inference server, I am still unable to restore the inference widget.
Yeah… even if you pay for it, it won’t come back…
@meganariley @not-lain This is a Serverless Inference API issue, but it seems to be a payment issue as well.
Am I supposed to use an HF Space instead? I’m trying to figure out how to create a Gradio app and make it work on the Space. Unfortunately, the demo code doesn’t work with the paid Inference Endpoint.
That would certainly be the quickest way to do it.
It’s almost joke software, but it’s a simple demo to assist in creating a demo Space.
It can be used for more than just T2I.
If the model you are going to use is a well-known model, someone else may have already published a high-performance space.
Hi @fzmnm The Inference API is a free service for the community to test models directly in the browser or via HTTP requests. It uses shared resources and is not available for every model. For production or latency/availability sensitive use cases, we recommend using Inference Endpoints instead, which will allow you to easily deploy your models on dedicated, fully-managed infrastructure. Inference Endpoints gives you the flexibility to quickly create endpoints on CPU or GPU resources, and is billed by compute uptime vs character usage. Further pricing information can be found here.
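For example, once an endpoint is up and running, you can query it over plain HTTP. The sketch below assumes a placeholder endpoint URL and token, and a simple text-generation style payload:

```python
# Hedged sketch: querying a dedicated Inference Endpoint over HTTP.
# ENDPOINT_URL and HF_TOKEN are placeholders for your own endpoint URL
# (shown on the endpoint's page) and a Hugging Face access token.
import requests

ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"  # placeholder
HF_TOKEN = "hf_..."  # placeholder token

headers = {"Authorization": f"Bearer {HF_TOKEN}"}
payload = {"inputs": "Once upon a time"}

response = requests.post(ENDPOINT_URL, headers=headers, json=payload, timeout=30)
response.raise_for_status()
print(response.json())
```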
Please let me know if you have additional questions.
Thank you very much for your resources.
My use case is that I want to share my homebrew small language model with my friends. It is a very small language model (92M parameters), so it can run on a potato computer.
Currently, I have successfully hosted it on the virtual machine dedicated to the Space. The solution is very similar to the Gradio app generator you provided. I use the free instance tier; since it is a potato model, the free tier can handle it.
However, I’m still unable to connect the Inference Endpoint in Gradio. I also tried developing it locally; maybe I should check the Endpoints forum for a good tutorial. But right now the vCPUs of the Space are enough for my potato model.
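For reference, the app.py in my Space is roughly the sketch below; the model id is a placeholder for my own repo, and the free CPU tier handles a ~92M-parameter model fine:

```python
# Rough sketch of a Gradio Space that runs the small model directly on the
# Space's CPU. The model id is a placeholder for my own repository.
import gradio as gr
from transformers import pipeline

generator = pipeline("text-generation", model="my-username/my-92m-model")  # placeholder id

def generate(prompt, max_new_tokens):
    out = generator(prompt, max_new_tokens=int(max_new_tokens), do_sample=True)
    return out[0]["generated_text"]

demo = gr.Interface(
    fn=generate,
    inputs=[gr.Textbox(label="Prompt"), gr.Slider(1, 256, value=64, label="Max new tokens")],
    outputs=gr.Textbox(label="Output"),
)

demo.launch()
```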
best,
Dear Ariley:
I have two questions:
- can I wire the Inference Endpoint I purchased to the Inference API, so people can test my model on the model card page?
- If I’m encouraged to use Spaces, how do I wire the Inference Endpoint into the Gradio app? The example HF provided does not work with the Inference Endpoint I purchased; roughly what I’m trying is sketched below.
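To make the second question concrete, here is roughly what I’m attempting; the endpoint URL and token are placeholders for my own, and I may well be misusing the client:

```python
# Hedged sketch: pointing a Gradio app at a dedicated Inference Endpoint
# instead of the serverless API. The URL and token below are placeholders.
import gradio as gr
from huggingface_hub import InferenceClient

ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"  # placeholder
client = InferenceClient(model=ENDPOINT_URL, token="hf_...")  # placeholder token

def generate(prompt):
    # Sends the text-generation request to the endpoint URL given above.
    return client.text_generation(prompt, max_new_tokens=64)

gr.Interface(fn=generate, inputs="text", outputs="text").launch()
```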
Best.
However, I’m still unable to connect the Inference Endpoint in Gradio
If the Space were set to Private, it would be virtually impossible to use the Gradio Space from the outside. It might be possible by scraping with a virtual browser, but that would be too much of a hassle.
With the Public setting it should normally be accessible, including from local code, but it would be faster to search this forum or GitHub for know-how; there are many small rules that you only find out by searching.
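For example, if the Space is Public, something like the sketch below using gradio_client should work from outside; the Space id and api_name are placeholders and depend on how the app is defined:

```python
# Hedged sketch: calling a Public Gradio Space from outside with gradio_client.
# The Space id is a placeholder; /predict is the default endpoint name for a
# single gr.Interface, but the name and arguments depend on the actual app.
from gradio_client import Client

client = Client("your-username/your-space")  # placeholder Space id
result = client.predict("Once upon a time", api_name="/predict")
print(result)
```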
You might also ask meganariley.