Image-to-image Stable Diffusion Inference Endpoint?

Hey there,

I’m trying to set up an image-to-image Stable Diffusion Inference Endpoint.

I saw CompVis/stable-diffusion-v-1-4-original, which seems set up for Inference Endpoints, but noticed that it states it is text-to-image only, not image-to-image, so I assume it wouldn’t support generating images from an existing image.

Then I looked at using lambdalabs/sd-image-variations-diffusers, but upon getting ready to set this up, I noticed this message:

Deploying this model with a custom task may fail as a file was not found in the repository. Refer to the documentation for more details.

When searching for a stable-diffusion model, lambdalabs/sd-image-variations-diffusers was the only result.

Is lambdalabs/sd-image-variations-diffusers set up to work with Inference Endpoints, or is there another model that is?

Thanks folks!

Hello Fred,

Yes, lambdalabs/sd-image-variations-diffusers is supported by Inference Endpoints by using a custom handler.
Hugging Face Inference Endpoints by default support all of the :hugs: Transformers and Sentence-Transformers tasks. If you want to deploy a custom model or customize a task, e.g. for diffusion, you can do this by creating a custom Inference Handler with a handler.py in your repository.

The warning you saw indicates that the model doesn’t have a supported task and doesn’t include a handler.py.

We already have an example of how to deploy stable-diffusion: philschmid/stable-diffusion-v1-4-endpoints · Hugging Face. With a custom handler you can customize the request payload to accept images and text; see this zero-shot-image-classification example.
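For the image-to-image case, a custom handler might look roughly like the sketch below. This is a hedged example, not the handler from that repo: the `EndpointHandler` class name and its `__init__`/`__call__` contract come from the custom-handler docs, while the `"image"` payload field and the use of `StableDiffusionImg2ImgPipeline` are my own assumptions about the request shape.

```python
import base64
from io import BytesIO


def decode_image_b64(data: str) -> bytes:
    """Decode a base64-encoded image payload into raw bytes."""
    return base64.b64decode(data)


class EndpointHandler:
    """Custom handler for an image-to-image Stable Diffusion endpoint.

    Inference Endpoints looks for this class in handler.py at the
    repository root and calls it with the parsed request body.
    """

    def __init__(self, path: str = ""):
        # Heavy imports are deferred so the module can be imported
        # without torch/diffusers installed.
        import torch
        from diffusers import StableDiffusionImg2ImgPipeline

        self.pipe = StableDiffusionImg2ImgPipeline.from_pretrained(path)
        self.pipe = self.pipe.to("cuda" if torch.cuda.is_available() else "cpu")

    def __call__(self, data: dict) -> dict:
        from PIL import Image

        prompt = data["inputs"]
        # "image" is an assumed field name: a base64-encoded PNG/JPEG.
        init_image = Image.open(BytesIO(decode_image_b64(data["image"]))).convert("RGB")
        out = self.pipe(prompt=prompt, image=init_image).images[0]

        buf = BytesIO()
        out.save(buf, format="PNG")
        return {"image": base64.b64encode(buf.getvalue()).decode("utf-8")}
```

The key idea is that the handler fully controls the request and response JSON, so images can travel both ways as base64 strings.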

Thanks so much for the swift reply here @philschmid.

This all seems super powerful, and your comment makes sense. It seems like there are two paths ahead:

  1. Use lambdalabs/sd-image-variations-diffusers and look to implement my own custom handler
  2. Use CompVis/stable-diffusion-v-1-4-original and try to customise it to accept an image payload

I’ve used Python a little before, but I’m a front-end dev by trade, so I’m a little out of my depth here. I’ll look into this and see how far I can get with it; I’ll probably prioritise trying #1.

If you’ve any other tips or resources that are worth looking at, let me know!
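For anyone following path #1, the front end will need to send the image to the endpoint somehow. Here is a rough sketch of building such a request payload in Python; the field names (`inputs`, `image`) and the endpoint URL are assumptions that depend entirely on how the custom handler is written:

```python
import base64
import json


def build_img2img_payload(prompt: str, image_bytes: bytes) -> str:
    """Build a JSON request body with a text prompt and a
    base64-encoded input image (field names are hypothetical)."""
    return json.dumps({
        "inputs": prompt,
        "image": base64.b64encode(image_bytes).decode("utf-8"),
    })


# Sending it would then look something like (URL and token are placeholders):
#
#   import urllib.request
#   req = urllib.request.Request(
#       "https://<your-endpoint>.endpoints.huggingface.cloud",
#       data=build_img2img_payload("a watercolor cat",
#                                  open("cat.png", "rb").read()).encode(),
#       headers={"Authorization": "Bearer <token>",
#                "Content-Type": "application/json"},
#   )
#   resp = urllib.request.urlopen(req)
```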


Did you get this to work? My install fails repeatedly with this at the tail of the logs:

2022-10-21 12:58:51,065 | INFO | Found custom pipeline at /repository/
2022-10-21 12:58:51,098 | ERROR | Traceback (most recent call last):
  File “/opt/conda/lib/python3.9/site-packages/starlette/”, line 671, in lifespan
    async with self.lifespan_context(app):
  File “/opt/conda/lib/python3.9/site-packages/starlette/”, line 566, in __aenter__
    await self._router.startup()
  File “/opt/conda/lib/python3.9/site-packages/starlette/”, line 648, in startup
    await handler()
  File “/app/./”, line 56, in some_startup_task
    inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
  File “/app/./huggingface_inference_toolkit/”, line 44, in get_inference_handler_either_custom_or_default_handler
    custom_pipeline = check_and_register_custom_pipeline_from_directory(model_dir)
  File “/app/./huggingface_inference_toolkit/”, line 202, in check_and_register_custom_pipeline_from_directory
    spec.loader.exec_module(handler)
  File “”, line 850, in exec_module
  File “”, line 228, in _call_with_frames_removed
  File “/repository/”, line 13, in <module>
    raise ValueError(“need to run on GPU”)
ValueError: need to run on GPU
2022-10-21 12:58:51,098 | ERROR | Application startup failed. Exiting.

Hello @dano1234,

It seems that you are not using a GPU instance, which is why the startup fails with raise ValueError(“need to run on GPU”).


Thanks for the quick reply. I get that error before I have a chance to specify anything about a GPU. I only filled in the model “philschmid/stable-diffusion-v1-4-endpoints”, AWS, and the region. Then it goes to the next screen and eventually fails before I can set anything else. I am probably missing something.


@dano1234 to select the instance type you need to click on “advanced configuration” and there you can select the instance type:

You can learn more about it in the documentation: Advanced Setup (Instance Types, Auto Scaling, Versioning)

Thanks! BTW, when it says ~$450/month, does it charge you that in a lump sum or is it pay as you go? I.e., can I try this out without being on the hook for a monthly charge?

Hey @dano1234,

At the end of each month, the user or organization account will be charged for the compute resources used while Endpoints are up and running, that is, in the “running” state.
So you will only pay for the minutes your endpoint is up. You can learn more here: Hugging Face Inference Endpoint Pricing
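In other words, a quoted monthly figure can be pro-rated down to minutes of uptime. A back-of-the-envelope sketch; the $450/month figure is the one from this thread, and the 730-hour month is my own averaging assumption:

```python
# Pro-rate a quoted monthly price down to an hourly/per-minute rate.
MONTHLY_PRICE_USD = 450.0   # figure quoted earlier in this thread
HOURS_PER_MONTH = 730       # assumed average: 24 * 365 / 12

hourly = MONTHLY_PRICE_USD / HOURS_PER_MONTH
per_minute = hourly / 60

# e.g. keeping the endpoint running for a 2-hour experiment:
experiment_cost = 2 * hourly
print(f"~${hourly:.2f}/hour, ~${per_minute:.4f}/minute, "
      f"2h experiment ~${experiment_cost:.2f}")
```

So a short experiment on a ~$450/month instance costs on the order of a dollar or two, not the full monthly figure, as long as you pause or delete the endpoint afterwards.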

Thanks, worked great.

This is a total noob question, but I guess I have to ask it somewhere :slight_smile:

The repo philschmid/stable-diffusion-v1-4-endpoints doesn’t seem to show up as an available model in the Model Repository dropdown of the Create a new Endpoint page.

From this discussion, it looks like it used to show up since folks here seem to be using it as a starting point to set up a working stable diffusion based inference endpoint?

Today the philschmid/stable-diffusion-v1-4-endpoints repo is showing up in the inference endpoint UI, along with a bunch of other diffusion models. Maybe there was a bug that was fixed recently. In any case, I’m unblocked now.

Hey @dskill,

We launched some new exciting features yesterday: Philipp Schmid on LinkedIn: Stable Diffusion with Hugging Face Inference Endpoints

Which added support for all text-to-image diffusion models based on StableDiffusionPipeline.

Lucky timing for me! :slight_smile:

I did see your excellent blog post earlier today and was able to get an inference endpoint up and running with text-to-image. It was extremely helpful.

While I have you, one follow-up question that’s been a bit confusing for me to untangle: how similar is the Inference API to the Endpoint API? Are they in fact identical?

I also noticed that the Inference API is turned off for the model. Is there a way for me to develop against the model using the Inference API somehow, or is the Endpoint API the only way for now?

Again, many thanks.

The Inference API is free and designed for testing and experimenting with models on the Hugging Face Hub. It runs on shared infrastructure. This means that you are sharing resources with other users, which could lead to high latency and low throughput, and there is no guarantee that the model runs on a GPU. The Inference API does not provide SLAs or other production-required features, like logs and monitoring.

Inference Endpoints on the other side support all of this.

OK, got it - so in some cases the Inference API may not be available, but that’s not something that I as an end user have any control over.

In general, though, it sounds like the intended workflow is: develop on the Inference API where it’s available, then graduate to the Endpoint API when you need a production solution. Thanks again for the responses!