Spaces stuck on "Building": infra-side issue, and how to troubleshoot further?

Hello. I have two private Spaces (one on my account, the other in an org), both set on sleep timers. They have spun up fine in the past after going to sleep. Since this morning, both are stuck in “Building” status. The Build Logs show the image pushed and the cache exported successfully. The Container Logs show the startup banner (below), but the app is not starting (I have some logging that typically prints when it is working). I have tried Restart Space and Factory Reboot, and also tried changing the Space Hardware in case it was hardware resource contention. Still “Building …”

(1) I see a few comments referring to a build queue. Is this indicated somewhere in the logs, or elsewhere that I can check? Sorry if I missed where it was indicated. (EDIT: I see “Build Queued” at the top of the Build Logs, so I might just need to wait for resource availability.)

(2) The Hugging Face service status page seems to indicate things are OK. Is something going on on the infra side?

Reading some of the threads, it sounds like waiting until something is updated on the HF side is the typical resolution. Is there anything else I can do to get it working again?
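For anyone who wants to watch the stage without refreshing the page, here is a minimal sketch that polls the Space runtime through `huggingface_hub`. It assumes the library is installed and a token with access to the private Space is configured; treating only "BUILDING" as the pending stage, and the repo id in the usage comment, are assumptions for illustration. It only reports the stage and does not explain why a build is stuck.

```python
import time
from typing import Callable

# Assumption: only "BUILDING" counts as "still pending"; other stages are treated as settled.
PENDING_STAGES = {"BUILDING"}


def wait_until_settled(
    get_stage: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_stage() until it leaves PENDING_STAGES or the timeout elapses.

    Returns the last stage seen, which may still be pending on timeout.
    """
    waited = 0.0
    stage = get_stage()
    while stage in PENDING_STAGES and waited < timeout_s:
        sleep(interval_s)
        waited += interval_s
        stage = get_stage()
    return stage


def space_stage(repo_id: str) -> str:
    """Fetch the current stage of a Space via the Hub API."""
    # Imported lazily so the polling helper above works without the library.
    from huggingface_hub import HfApi

    return HfApi().get_space_runtime(repo_id).stage


# Usage (hypothetical repo id):
# print(wait_until_settled(lambda: space_stage("your-user/your-space")))
```

If the stage is still "BUILDING" after the timeout, that matches what the UI shows and points to the queue or infra side rather than the app code.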



===== Application Startup at 2023-09-08 19:17:32 =====

Not sure if it helps, but I have received a different message when I try the Factory Reboot. Still getting through the build queue successfully and the

Container Logs (after long waiting period):

Error: Failed to load logs: Not Found. Logs are persisted for 30 days after the Space stops running.

Build Logs tail:

→ COPY --link --chown=1000 --from=lfs /app /home/user/app
DONE 0.0s

→ COPY --link --chown=1000 ./ /home/user/app
DONE 0.0s

→ Pushing image
DONE 23.4s

→ Exporting cache
DONE 6.9s

Sorry, could you please try duplicating your Space and see if the new Space builds and runs successfully?
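Duplicating can also be done from a script. The sketch below uses `HfApi.duplicate_space` from `huggingface_hub`; the `-clone` suffix, the helper names, and the repo id in the usage comment are made up for illustration, and it assumes a token with write access to the target namespace.

```python
def clone_id(repo_id: str, suffix: str = "-clone") -> str:
    """Build a target id in the same namespace, e.g. 'user/app' -> 'user/app-clone'."""
    namespace, name = repo_id.split("/", 1)
    return f"{namespace}/{name}{suffix}"


def duplicate_private_space(repo_id: str) -> str:
    """Duplicate a Space into a new private Space and return the new repo id."""
    # Imported lazily so clone_id() can be used without the library installed.
    from huggingface_hub import HfApi

    target = clone_id(repo_id)
    HfApi().duplicate_space(from_id=repo_id, to_id=target, private=True)
    return target


# Usage (hypothetical repo id):
# print(duplicate_private_space("your-user/your-space"))
```

After duplicating, check the new Space's settings (hardware, secrets, storage) before relying on it, since they may not all carry over from the original.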

Thanks @radames. I just cloned it; it downloaded all of the model info and started up!

I am guessing this indicates that something in HF_HOME (configured as /data/.huggingface in the Space) got corrupted or similar. Is there a way to delete that folder (for example, removing it from the variables, running, and adding it back)? Or should I just blow away the old Spaces and go with the clones?

(EDIT2: I see an option to “Remove current storage” in Settings, but I get this error when I try to add storage back; it might just take time to reset:

“Error while upgrading the persistent storage: An error happened while upgrading your Space’s storage. Status code: 409”)
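In case it helps someone scripting the same storage reset, here is a rough sketch using `delete_space_storage` and `request_space_storage` from `huggingface_hub`. It assumes the 409 is transient and worth retrying, which this thread does not confirm; the helper names, the `"small"` tier, the retry count and delay, and the repo id in the usage comment are illustrative assumptions. Deleting storage is irreversible, so treat this with care.

```python
import time
from typing import Any, Callable, Optional


def http_status(exc: Exception) -> Optional[int]:
    """Return the HTTP status code attached to an exception, if any."""
    response = getattr(exc, "response", None)
    return getattr(response, "status_code", None)


def retry_on_conflict(
    action: Callable[[], Any],
    status_of: Callable[[Exception], Optional[int]] = http_status,
    attempts: int = 5,
    delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call action(); if it fails with status 409, wait and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            # Re-raise anything that is not a 409, or a 409 on the last attempt.
            if status_of(exc) != 409 or attempt == attempts:
                raise
            sleep(delay_s)


def reset_storage(repo_id: str, tier: str = "small") -> Any:
    """Delete a Space's persistent storage, then request it again."""
    # Imported lazily so the retry helper works without the library installed.
    from huggingface_hub import HfApi

    api = HfApi()
    api.delete_space_storage(repo_id)  # irreversible: all data under /data is lost
    return retry_on_conflict(lambda: api.request_space_storage(repo_id, tier))


# Usage (hypothetical repo id):
# reset_storage("your-user/your-space")
```

If the 409 persists after several retries, it is probably not a timing issue and is better reported to the HF team.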

Thanks @ecarr-compoze for the feedback, I think it’s the same issue related to this

We’ll investigate with infra next week cc @chris-rannou