Image-to-image Stable Diffusion Inference Endpoint?

Hey there,

I’m trying to set up an image-to-image Stable Diffusion Inference Endpoint.

I saw CompVis/stable-diffusion-v-1-4-original, which seems set up for Inference Endpoints, but noticed it states it is only text-to-image, not image-to-image, so I assume it wouldn’t support generating images from an existing image.

Then I looked at using lambdalabs/sd-image-variations-diffusers, but upon getting ready to set this up, I noticed this message:

Deploying this model with a custom task may fail as a handler.py file was not found in the repository. Refer to the documentation for more details.

When searching for a stable-diffusion model, lambdalabs/sd-image-variations-diffusers was the only result.

Is lambdalabs/sd-image-variations-diffusers set up to work with Inference Endpoints, or is there another model that is?

Thanks folks!
Fred.

Hello Fred,

Yes, lambdalabs/sd-image-variations-diffusers is supported by Inference Endpoints by using a custom handler.
Hugging Face Inference Endpoints by default support all of the :hugs: Transformers and Sentence Transformers tasks. If you want to deploy a custom model or customize a task, e.g. for diffusion, you can do this by creating a custom Inference Handler with a handler.py.

The warning you saw indicates that the model doesn’t have a supported task and doesn’t include a handler.py.

We already have an example of how to deploy stable-diffusion: philschmid/stable-diffusion-v1-4-endpoints · Hugging Face. With a custom handler you can customize the request payload to accept images and text; see this zero-shot-image-classification example.
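
To make that concrete, here is a minimal sketch of what an image-to-image handler.py could look like. It assumes the diffusers StableDiffusionImg2ImgPipeline and a JSON payload carrying a prompt plus a base64-encoded init image; the field names (“prompt”, “image”, “strength”) are illustrative, not a fixed API:

    # handler.py - minimal image-to-image sketch (untested, adjust to your model)
    import base64
    from io import BytesIO
    from typing import Any, Dict

    import torch
    from diffusers import StableDiffusionImg2ImgPipeline
    from PIL import Image


    class EndpointHandler:
        def __init__(self, path: str = ""):
            # load the pipeline from the repository directory and move it to the GPU
            self.pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
                path, torch_dtype=torch.float16
            ).to("cuda")

        def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
            inputs = data.get("inputs", data)
            prompt = inputs["prompt"]
            # the init image is assumed to arrive as a base64 string inside the JSON body
            init_image = Image.open(BytesIO(base64.b64decode(inputs["image"]))).convert("RGB")
            strength = float(inputs.get("strength", 0.75))

            # note: older diffusers versions name this argument init_image instead of image
            image = self.pipe(prompt=prompt, image=init_image, strength=strength).images[0]

            # return the result as base64 so it fits in a JSON response
            buffer = BytesIO()
            image.save(buffer, format="PNG")
            return {"image": base64.b64encode(buffer.getvalue()).decode("utf-8")}

If the handler needs extra packages that aren’t in the default container image, they can usually be listed in a requirements.txt in the same repository.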

Thanks so much for the swift reply here @philschmid.

This all seems super powerful, and your comment makes sense. It seems like there are two paths ahead:

  1. Use lambdalabs/sd-image-variations-diffusers and look to implement my own custom handler (handler.py)
  2. Use CompVis/stable-diffusion-v-1-4-original and try to customise it to accept an image payload

I’ve used Python a little before, but I’m a front-end dev by trade, so I’m a little out of my depth here. I’ll look into this and see how far I can get with it; I’ll probably prioritise trying #1.

If you’ve any other tips or resources that are worth looking at, let me know!

Cheers.


Fred,

Did you get this to work? My install fails repeatedly with this in the tail of the logs:

tbghq 2022-10-21T12:58:51.065Z 2022-10-21 12:58:51,065 | INFO | Found custom pipeline at /repository/handler.py
tbghq 2022-10-21T12:58:51.098Z 2022-10-21 12:58:51,098 | ERROR | Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 671, in lifespan
    async with self.lifespan_context(app):
  File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 566, in __aenter__
    await self._router.startup()
  File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 648, in startup
    await handler()
  File "/app/./webservice_starlette.py", line 56, in some_startup_task
    inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
  File "/app/./huggingface_inference_toolkit/handler.py", line 44, in get_inference_handler_either_custom_or_default_handler
    custom_pipeline = check_and_register_custom_pipeline_from_directory(model_dir)
  File "/app/./huggingface_inference_toolkit/utils.py", line 202, in check_and_register_custom_pipeline_from_directory
    spec.loader.exec_module(handler)
  File "", line 850, in exec_module
  File "", line 228, in _call_with_frames_removed
  File "/repository/handler.py", line 13, in <module>
    raise ValueError("need to run on GPU")
ValueError: need to run on GPU
tbghq 2022-10-21T12:58:51.098Z 2022-10-21 12:58:51,098 | ERROR | Application startup failed. Exiting.

Hello @dano1234,

It seems that you are not using a GPU instance; the handler raises ValueError("need to run on GPU") when it can’t find one.
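
For context, the error comes from a guard near the top of the example handler.py, presumably something like this sketch (not the exact code from the repo):

    import torch

    # the example handler refuses to start on CPU-only instances,
    # since Stable Diffusion without a GPU would be impractically slow
    device = "cuda" if torch.cuda.is_available() else "cpu"
    if device != "cuda":
        raise ValueError("need to run on GPU")

So the fix is to pick a GPU instance type when creating the endpoint.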

Phil,

Thanks for the quick reply. I get that error before I have a chance to specify anything about a GPU. I only filled in “philschmid/stable-diffusion-v1-4-endpoints”, AWS, and the region. Then it goes to the next screen and eventually fails before I can set anything else. I am probably missing something.

Dan

@dano1234, to select the instance type you need to click on “Advanced configuration”, and there you can choose a GPU instance.

You can learn more in the documentation: Advanced Setup (Instance Types, Auto Scaling, Versioning)

Thanks! BTW, when it says ~$450/month, does it charge you that as a lump sum or is it pay-as-you-go? I.e. can I try this out without being on the hook for a monthly charge?

Hey @dano1234,

At the end of each month the user or organization account will be charged for the compute resources used while Endpoints are up and running, that is, in the “running” state.
So you will only pay for the minutes your endpoint is up. You can learn more here: Hugging Face Inference Endpoint Pricing
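
To make that concrete: if the listed rate works out to roughly $450 for a full month (about 730 hours, so roughly $0.62 per hour), then spinning up an endpoint, testing it for two hours, and deleting it should cost on the order of $1.25 rather than the full monthly figure.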

Thanks, that worked great.

This is a total noob question, but I guess I have to ask it somewhere :slight_smile:

The repo philschmid/stable-diffusion-v1-4-endpoints doesn’t seem to show up as an available model in the Model Repository dropdown of the Create a new Endpoint page.

From this discussion, it looks like it used to show up, since folks here seem to have used it as a starting point to set up a working Stable Diffusion based Inference Endpoint?

Today the philschmid/stable-diffusion-v1-4-endpoints repo is showing up in the inference endpoint UI, along with a bunch of other diffusion models. Maybe there was a bug that was fixed recently. In any case, I’m unblocked now.

Hey @dskill,

We launched some exciting new features yesterday: Philipp Schmid on LinkedIn: Stable Diffusion with Hugging Face Inference Endpoints

This added support for all text-to-image diffusion models based on StableDiffusionPipeline.
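
For anyone following along, calling such a deployed text-to-image endpoint from Python might look roughly like the sketch below. It assumes the endpoint accepts a JSON body with an “inputs” prompt and returns raw image bytes; check your endpoint’s handler for the exact payload and response format (a base64 string inside JSON is also common):

    import requests

    ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"  # placeholder URL
    HF_TOKEN = "hf_..."  # your Hugging Face access token

    response = requests.post(
        ENDPOINT_URL,
        headers={
            "Authorization": f"Bearer {HF_TOKEN}",
            "Content-Type": "application/json",
            "Accept": "image/png",  # some handlers return base64 JSON instead
        },
        json={"inputs": "a photo of an astronaut riding a horse on the moon"},
    )
    response.raise_for_status()

    with open("generation.png", "wb") as f:
        f.write(response.content)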

Lucky timing for me! :slight_smile:

I did see your excellent blog post earlier today and was able to get an inference endpoint up and running with text-to-image. It was extremely helpful.

While I have you, one follow-up question that’s been a bit confusing for me to untangle: how similar is the Inference API to the Endpoint API? Are they in fact identical?

I also noticed that the Inference API is turned off for https://huggingface.co/stabilityai/stable-diffusion-2. Is there a way for me to develop against that model using the Inference API somehow, or is an Inference Endpoint the only way for now?

Again, many thanks.

The Inference API is free and designed for testing and experimenting with models on huggingface.co. It runs on shared infrastructure, which means you are sharing resources with other users; this can lead to high latency and low throughput, and there is no guarantee that the model runs on a GPU. The Inference API also does not provide SLAs or other production-required features like logs and monitoring.

Inference Endpoints, on the other hand, support all of this.
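
For comparison, a call to the free, shared Inference API has essentially the same JSON-over-HTTP shape, just against api-inference.huggingface.co instead of your endpoint URL. A minimal sketch, assuming the hosted API is enabled for the model (CompVis/stable-diffusion-v1-4 is used purely as an example id):

    import requests

    API_URL = "https://api-inference.huggingface.co/models/CompVis/stable-diffusion-v1-4"
    HF_TOKEN = "hf_..."  # your Hugging Face access token

    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {HF_TOKEN}"},
        json={"inputs": "a watercolor painting of a lighthouse at dusk"},
    )
    response.raise_for_status()  # a 503 while the shared model loads is common

    # text-to-image models typically return raw image bytes
    with open("inference_api.png", "wb") as f:
        f.write(response.content)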

OK, got it - so in some cases the Inference API may not be available, and that’s not something that I, as an end user, have any control over.

In general, though, it sounds like the intended workflow is to develop on the Inference API where it’s available, and then graduate to Inference Endpoints when you need a production solution. Thanks again for the responses!