I have a Stable Diffusion Web UI that I’m attempting to run on A10G - Small on Hugging Face. Unfortunately, it gets stuck at “Scheduling Space” for 30-60+ minutes every time I attempt to wake the interface up.
Is there something I’m doing wrong? I wish Colab could provide a persistent URL - that would be an alternate solution.
Ok, it’s actually over 60 minutes most of the time on “Scheduling Space” on A10G - Small.
I have updated the Dockerfile to use @camenduru’s T4 Private version, since that comes with 30 GB of RAM versus the 15 GB in the A10G Small, and am trying again.
Does anyone know how to speed up the waking of a space?
It seems the Docker image built from the Dockerfile in your Space iamrobotbear/webui-docker is too large to fit on A10G small hardware, which leads to these scheduling issues.
To solve this you should consider either slimming down the image by maybe removing some checkpoints or use a flavor with more storage available such as A10G large.
How do I know when a space is running low on storage? There’s no mention of it in the build or container logs?
If I am out of storage why would the space eventually start on either T4 Medium or A10G Small?
When I add up the size of each of my checkpoints I’m at 15.04 GB. Obviously there are also all of the dependencies and the OS, but since there’s no interactive terminal, how can I check?
If I wanted to build an exact copy of my Space locally, is there a doc you can point me to? Since I don’t have that hardware and it would be built locally on my MacBook Pro, I’m not sure how to configure my Dockerfile.
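One rough way to check without a terminal is to log directory sizes from inside the container at startup, so the numbers show up in the container logs. The sketch below is only an illustration: the paths are taken from the traceback later in this thread and may not match your layout, and it measures files in the running container, not the total image size (files duplicated across image layers will not show up here).

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not count link targets twice
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total


def report(paths):
    """Return one human-readable line per path."""
    lines = []
    for p in paths:
        if os.path.isdir(p):
            lines.append(f"{p}: {dir_size_bytes(p) / 1e9:.2f} GB")
        else:
            lines.append(f"{p}: missing")
    return lines


if __name__ == "__main__":
    # Assumed paths; adjust to wherever the checkpoints actually live.
    for line in report(["/content/stable-diffusion-webui/models", "/content"]):
        print(line, flush=True)
```

Running this before the web UI launches should print the sizes into the container log each time the Space starts.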
Are you able to identify how large my total image is from your side since you have access to the backend?
How long should I expect my Space to take to start up if I chop this down to a single checkpoint on A10G?
Lastly, @chris-rannou or @camenduru, I’m trying to add authentication to the Space, either basic user/pass or, even better, SSO (probably a pipe dream). Is that possible via Sharing Your App or via adding secrets to the Spaces configuration, since this is a Docker build?
Essentially I’d like to provide access to the UI without everyone having to have a Hugging Face account/be added to my organization and not permit the public to run up a huge bill on A10G Large if the space is public.
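For the basic user/pass route, one possible approach is to keep the credentials in a Space secret and read them at startup. This is a minimal sketch under assumptions: `WEBUI_AUTH` is a hypothetical secret name, the `user:password` comma-separated format is my own choice, and it relies on Spaces exposing secrets to the running container as environment variables. It does not cover SSO.

```python
import os


def parse_auth(raw):
    """Turn "alice:pw1,bob:pw2" into [("alice", "pw1"), ("bob", "pw2")]."""
    pairs = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the first colon only, so passwords may contain colons.
        user, sep, password = item.partition(":")
        if not sep or not user or not password:
            # Do not echo the entry itself, to keep secrets out of the logs.
            raise ValueError("expected user:password entries")
        pairs.append((user, password))
    return pairs


def auth_from_env(var="WEBUI_AUTH"):
    """Read credentials from an environment variable (hypothetical name)."""
    pairs = parse_auth(os.environ.get(var, ""))
    if not pairs:
        raise RuntimeError(f"{var} is not set; refusing to start without auth")
    return pairs
```

The resulting list of tuples is the shape Gradio's `launch(auth=...)` accepts, so a plain Gradio app could call `demo.launch(auth=auth_from_env())`. For the Stable Diffusion Web UI, I believe the equivalent is its `--gradio-auth` command-line flag, but check that against the version you are running.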
You should be able to build this Dockerfile without having the required hardware.
Currently your image is at about 100+ GB. This is mainly due to an optimization issue in the Dockerfile (the chown at the end): a recursive chown in its own layer copies every file it touches into a new layer, so those files end up stored twice. I’ll soon come back to you with a suggested optimization.
Once the image is optimized, and if you reduce it to a single checkpoint, startup should take no longer than about 10 minutes (difficult to estimate beforehand). The main delay currently is the time required to download the image.
Traceback (most recent call last):
  File "/content/stable-diffusion-webui/webui.py", line 12, in <module>
    from modules.call_queue import wrap_queued_call, queue_lock, wrap_gradio_gpu_call
  File "/content/stable-diffusion-webui/modules/call_queue.py", line 7, in <module>
    from modules import shared
  File "/content/stable-diffusion-webui/modules/shared.py", line 125, in <module>
    os.makedirs(cmd_opts.hypernetwork_dir, exist_ok=True)
  File "/usr/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/content/stable-diffusion-webui/models/hypernetworks'
@chris-rannou how can we solve this without chown and chmod?
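The traceback says the user the container runs as cannot create `models/hypernetworks`, so the parent directory is presumably owned by someone else (likely root from the build). The usual Dockerfile-side fix is to set ownership when the files are added, for example with `COPY --chown=...`, or to create the directories while already running as the runtime user, instead of a late recursive chown. To find out which directories are affected, a small startup check like the sketch below could help; the helper name and the example paths are my own and not part of the web UI.

```python
import os


def unwritable_dirs(paths):
    """Return the paths the current user could not create or write into."""
    bad = []
    for path in paths:
        # If the path does not exist yet, os.makedirs needs write access
        # to the nearest existing ancestor, so probe that instead.
        probe = os.path.abspath(path)
        while not os.path.exists(probe):
            parent = os.path.dirname(probe)
            if parent == probe:
                break
            probe = parent
        if not (os.path.isdir(probe) and os.access(probe, os.W_OK | os.X_OK)):
            bad.append(path)
    return bad


if __name__ == "__main__":
    # Example paths based on the traceback above; adjust as needed.
    candidates = [
        "/content/stable-diffusion-webui/models/hypernetworks",
        "/content/stable-diffusion-webui/models",
    ]
    print("unwritable:", unwritable_dirs(candidates), flush=True)
```

Anything this reports is a directory whose ownership needs fixing at build time.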
@chris-rannou Do I need to merge it or will it automatically build?
Any guidance on how I can restrict access to the space so that it’s not publicly accessible? I want to be able to provide access to the URL or ideally iframe / webcomponent without having the space be public.