I am new to HF Spaces. Here is the project I’m imagining:
I’m working on an iOS app that takes a hand-drawn image, passes it through ControlNet, and gets back a diffusion-generated image to print out. The goal would be to bring this to a group of kids, let them play with diffusion, and give each of them a printout of their own work enhanced by it.
I know how to run ControlNet in a Colab notebook. I am wondering what the simplest translation of this to an HF Space would be. I don’t necessarily need a frontend for the experience - I think we will build that in iOS - so I don’t think a Streamlit or Gradio frontend makes sense. Really I just want a persistent server that can act as an endpoint for ControlNet (effectively the code from a Python notebook) for an iOS app. I see the word Docker and I flinch (“is Docker overkill for this?”), but is that the right way to use HF Spaces if what I effectively want is to turn a Colab notebook into a persistent endpoint I can call from, say, an iOS app?
Spaces are designed for demos and shareable ML apps. You might be interested in our Inference Endpoints solution, which provides exactly what you’re looking for: a dedicated, fully managed inference server. You may need to convert your Jupyter notebook into custom Python handler code, and with a few clicks you can deploy it and have an API endpoint.
Luckily @philschmid wrote a really comprehensive tutorial about deploying ControlNet as an Inference Endpoint, so you can just copy and adapt his handler.py.
You can also see multiple model templates on our Hub.
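In case it helps to see the shape of it, here is a rough sketch of what a custom handler.py for a ControlNet Inference Endpoint can look like. This is not Phil’s exact handler: the model IDs and the request/response contract (base64 drawing plus prompt in, base64 PNG out) are assumptions you’d adapt from your Colab code and his tutorial.

```python
import base64
from io import BytesIO
from typing import Any, Dict

import torch
from PIL import Image
from diffusers import (
    ControlNetModel,
    StableDiffusionControlNetPipeline,
    UniPCMultistepScheduler,
)


class EndpointHandler:
    """Custom handler interface expected by Inference Endpoints."""

    def __init__(self, path: str = ""):
        # Example model IDs -- swap in whichever ControlNet variant you used in Colab.
        controlnet = ControlNetModel.from_pretrained(
            "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
        )
        self.pipe = StableDiffusionControlNetPipeline.from_pretrained(
            "runwayml/stable-diffusion-v1-5",
            controlnet=controlnet,
            torch_dtype=torch.float16,
        )
        self.pipe.scheduler = UniPCMultistepScheduler.from_config(self.pipe.scheduler.config)
        self.pipe.to("cuda")

    def __call__(self, data: Dict[str, Any]) -> Dict[str, str]:
        # Assumed payload: {"inputs": {"image": "<base64 drawing>", "prompt": "..."}}.
        inputs = data.get("inputs", data)
        drawing = Image.open(BytesIO(base64.b64decode(inputs["image"]))).convert("RGB")
        prompt = inputs.get("prompt", "a colorful illustration")

        result = self.pipe(prompt, image=drawing, num_inference_steps=20).images[0]

        buffer = BytesIO()
        result.save(buffer, format="PNG")
        return {"image": base64.b64encode(buffer.getvalue()).decode("utf-8")}
```

Your iOS app (or a quick test script) would then POST that JSON payload to the endpoint URL with your HF token in the Authorization header.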
Thank you! I think this is exactly what we’re looking for.
I will look further, but two quick questions:
1. I know HF provides grants for Spaces. Are there grants for Endpoints? The end goal of this is an educational demo (exploring ControlNet with kids: draw a picture and pass it through ControlNet). And if not –
2. I will look into this, but I think the main things we will need uptime for are a) testing and b) actual use with kids. I assume that’s easy to manage (turning an endpoint on and off)?
I’m not sure about grants for Inference Endpoints, since it’s a production-ready service. cc @philschmid
On another note, we’re going to release the image-to-image task, which includes ControlNet, in our Inference API. It offers free access with rate limits and higher limits for Pro users. This service is not optimized for high loads or fast inference, but it could be a good start.
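If you do start with that serverless route, calling it from a quick test script (before wiring up the iOS client) follows the usual bearer-token pattern of the Inference API. The exact payload for the image-to-image/ControlNet task isn’t spelled out in this thread, so treat the body below as a placeholder to adapt from the task docs; it assumes raw image bytes in and generated image bytes back.

```python
import requests

# Example model ID -- pick whichever ControlNet checkpoint you want to query.
API_URL = "https://api-inference.huggingface.co/models/lllyasviel/sd-controlnet-scribble"
headers = {"Authorization": "Bearer hf_xxx"}  # your HF (or Pro) access token

# Assumed request/response format: raw PNG bytes in, image bytes out.
with open("drawing.png", "rb") as f:
    response = requests.post(API_URL, headers=headers, data=f.read())

with open("output.png", "wb") as f:
    f.write(response.content)
```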
Hello again @radames!
I have now looked at the blog post you shared - which is exactly what we are looking for, an endpoint for ControlNet.
For pricing an endpoint - is it pay-per-use, or is it billed for as long as the endpoint is up? I ask because the suggested GPU option from Phil’s blog post, “GPU medium”, is estimated at $900/month, and “GPU small” at ~$400/month.
I think if our project really got going we’d love to pay for a full-time endpoint, but for now we are just in the phase of building an iOS frontend to diffusion models so that we can run a few initial workshops for kids on drawing and diffusion and see if there really is something there. (I think we’d be using the endpoint a few hours at a time at most.)
Do you have any suggestions – perhaps a grant we could apply for, like the grants for HF Spaces – or a lower-cost way to provision an endpoint with a GPU?
Thank you!!
It seems that the endpoint incurs cost for the whole time it is up, not only for the time spent computing. How did you manage this so that you get both low cost and a decent user experience?
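For reference, the usual way to keep costs down for occasional workshops is to pause the endpoint between sessions, since a paused endpoint stops accruing charges. Newer versions of huggingface_hub expose this from Python; the endpoint name below is a placeholder, and it’s worth confirming these helpers exist in your installed version.

```python
from huggingface_hub import HfApi

api = HfApi(token="hf_xxx")  # token with access to your Inference Endpoints

# Resume the endpoint shortly before a workshop and wait until it is ready...
endpoint = api.resume_inference_endpoint("controlnet-workshop")  # placeholder name
endpoint.wait()

# ...then pause it afterwards so you are not billed for idle GPU time.
api.pause_inference_endpoint("controlnet-workshop")
```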