Space startup never finishes, and the duplicated Space fails with 'Runtime error'

Hello everyone, I’ve contacted Hugging Face support at 3 email addresses, but with no success…

My HF Space, which had been running fine for several weeks, suddenly failed to initialize one day when waking up from sleep mode, although absolutely no action had been taken on it. Whatever the hardware and its resources, the launch phase lasts forever (for example, the message "Building on T4" is shown non-stop) and produces no logs.

Switching to “dev mode”, which would have been very useful for debugging, is also impossible…

(Nothing happens when I press the “dev mode” button: the page just refreshes, and then nothing happens, not even one line of log.)

Here is the name of the Space: « … /spaces/NoQuest/QP_AN »

I don’t want to use the “Factory Rebuild” function, because I don’t want to risk losing any settings I made at the time, or any data, or the Git and Git LFS state, and a lot of heavy model files are stored on the disk.

So I made a duplicate of the Space (publicly visible at “spaces/NoQuest/QP_AN_/settings”), and when I start that cloned Space, it produces the logs that I provide here, just under my signature.

But of course I’m more interested in using the original application (not the cloned version).

And finally, if I run a Factory Rebuild on the clone, the same error happens, with the same log as in the attached file.

Also, I can’t access the Space over SSH on that cloned version either. It’s especially that fact that made me post this request for help, because otherwise I would already have investigated the problem myself.

Thanks in advance for your help.

The famous log:

"Runtime error

App process crashed

Container logs:

===== Application Startup at 2024-11-18 23:20:10 =====

False

The following directories listed in your path were found to be non-existent: {PosixPath(‘/usr/local/etc/pyenv.d’),

PosixPath(‘/usr/lib/pyenv/hooks’), PosixPath(‘/usr/etc/pyenv.d’), PosixPath(‘/etc/pyenv.d’)}

The following directories listed in your path were found to be non-existent: {PosixPath(‘Europe/Paris’)}

The following directories listed in your path were found to be non-existent: {PosixPath(‘//172.20.0.1’), PosixPath(‘tcp’),

PosixPath(‘443’)}

The following directories listed in your path were found to be non-existent: {PosixPath(‘NoQuest/QP_AN_’)}

The following directories listed in your path were found to be non-existent: {PosixPath(‘//172.20.0.1’), PosixPath(‘tcp’),

PosixPath(‘443’)}

CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths…

DEBUG: Possible options found for libcudart.so: {PosixPath(‘/usr/local/cuda/lib64/libcudart.so’)}

CUDA SETUP: PyTorch settings found: CUDA_VERSION=124, Highest Compute Capability: 7.5.

CUDA SETUP: To manually override the PyTorch CUDA version please

see: github /TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md

CUDA SETUP: Required library version not found: libbitsandbytes_cuda124.so. Maybe you need to compile it from

source?

CUDA SETUP: Defaulting to libbitsandbytes_cpu.so…

================================================ERROR=====================================

CUDA SETUP: CUDA detection failed! Possible reasons:

  1. You need to manually override the PyTorch CUDA version. Please see:

"github /TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md

  1. CUDA driver not installed

  2. CUDA not installed

  3. You have multiple conflicting CUDA libraries

  4. Required library not pre-compiled for this bitsandbytes release!

CUDA SETUP: If you compiled from source, try again with make CUDA_VERSION=DETECTED_CUDA_VERSION

for example, make CUDA_VERSION=113.

CUDA SETUP: The CUDA version for the compile might depend on your conda install. Inspect CUDA version via

conda list | grep cuda.

================================================================================

CUDA SETUP: Something unexpected happened. Please compile from source:

git clone github /TimDettmers/bitsandbytes.git

cd bitsandbytes

CUDA_VERSION=124

python setup.py install

CUDA SETUP: Setup Failed!

Traceback (most recent call last):

File "/home/user/app/server.py", line 35, in <module>

from modules import chat, loaders, presets, shared, training, ui, utils

File "/home/user/app/modules/chat.py", line 17, in <module>

from modules.text_generation import (

File "/home/user/app/modules/text_generation.py", line 22, in <module>

from modules.models import clear_torch_cache, local_rank

File "/home/user/app/modules/models.py", line 10, in <module>

from accelerate import infer_auto_device_map, init_empty_weights

File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/accelerate/__init__.py", line 3, in <module>

from .accelerator import Accelerator

File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/accelerate/accelerator.py", line 35, in <module>

from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state

File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/accelerate/checkpointing.py", line 24, in <module>

from .utils import (

File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/accelerate/utils/__init__.py", line 131, in <module>

from .bnb import has_4bit_bnb_layers, load_and_quantize_model

File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/accelerate/utils/bnb.py", line 42, in <module>

import bitsandbytes as bnb

File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/bitsandbytes/__init__.py", line 6, in <module>

from . import cuda_setup, utils, research

File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>

from . import nn

File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>

from .modules import LinearFP8Mixed, LinearFP8Global

File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>

from bitsandbytes.optim import GlobalOptimManager

File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>

from bitsandbytes.cextension import COMPILED_WITH_CUDA

File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 20, in <module>

raise RuntimeError('''

RuntimeError:

CUDA Setup failed despite GPU being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them

to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes

and open an issue at: github /TimDettmers/bitsandbytes/issues"
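
The log above says that PyTorch reports CUDA_VERSION=124 while the installed bitsandbytes has no libbitsandbytes_cuda124.so. A minimal sketch for checking that mismatch without importing bitsandbytes (importing it is what crashes) could look like the following. The helper names (find_package_dir, bundled_cuda_tags, has_matching_binary, cuda_tag) are hypothetical, and the file-name pattern is an assumption based only on the library names printed in the log.

```python
import importlib.util
import re
from pathlib import Path


def find_package_dir(package_name):
    """Locate an installed top-level package's directory without importing it."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


def cuda_tag(version):
    """Turn a CUDA version string such as '12.4' into a tag such as '124'."""
    return version.replace(".", "")


def bundled_cuda_tags(package_dir):
    """List the CUDA tags of the libbitsandbytes_cuda*.so files in package_dir.

    Assumption: binaries are named like libbitsandbytes_cuda124.so,
    optionally with a _nocublaslt suffix, as the log above suggests.
    """
    pattern = re.compile(r"libbitsandbytes_cuda(\d+)(?:_nocublaslt)?\.so")
    tags = set()
    for lib in Path(package_dir).glob("libbitsandbytes_cuda*.so"):
        match = pattern.fullmatch(lib.name)
        if match:
            tags.add(match.group(1))
    return sorted(tags)


def has_matching_binary(package_dir, tag):
    """True if a precompiled binary for the given CUDA tag is bundled."""
    return tag in bundled_cuda_tags(package_dir)


if __name__ == "__main__":
    pkg = find_package_dir("bitsandbytes")
    if pkg is None:
        print("bitsandbytes is not installed in this environment")
    else:
        wanted = cuda_tag("12.4")  # taken from CUDA_VERSION=124 in the log
        print("bundled CUDA tags:", bundled_cuda_tags(pkg))
        print("binary for", wanted, "present:", has_matching_binary(pkg, wanted))
```

If the wanted tag is missing from the list, the pinned bitsandbytes release and the PyTorch CUDA build in the Space's requirements probably no longer match, which would be consistent with a Space that broke after a rebuild without any code change.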


On HF Spaces, the situation has improved a lot compared with a while ago, but there are still quite a few errors.

NoQuest/QP_AN_: Not found…

a lot of heavy models files are there in the disk

If a Space contains model files of several GB or more, or other large files, the build may fail and the Space may not start up. That may be related to this issue.
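
To see which files might be heavy enough to matter, one option is to scan a local clone of the Space repository. This is only a rough sketch: the function name find_large_files and the 1 GiB default threshold are assumptions for illustration, not a documented Spaces limit.

```python
import os


def find_large_files(root, min_bytes=1024**3):
    """Return (size_in_bytes, relative_path) pairs for files >= min_bytes.

    Results are sorted from largest to smallest. The .git directory is
    skipped so that only working-tree files are reported.
    """
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file: ignore it
            if size >= min_bytes:
                results.append((size, os.path.relpath(path, root)))
    return sorted(results, reverse=True)


if __name__ == "__main__":
    # "." is assumed to be the root of a local clone of the Space.
    for size, rel_path in find_large_files("."):
        print(f"{size / 1024**3:6.2f} GiB  {rel_path}")
```

Files that show up here could be moved to a separate model repository and downloaded at startup instead of being stored in the Space itself, if that fits the application.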