This Space recently stopped working properly. I tried a factory reboot, but that didn't help.
The log shows the following error:
Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/gradio/routes.py", line 247, in run_predict
    output = await app.blocks.process_api(
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/gradio/blocks.py", line 641, in process_api
    predictions, duration = await self.call_function(fn_index, processed_input)
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/gradio/blocks.py", line 556, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/user/app/model.py", line 1241, in run_with_translation
    frames = self.run(text, seed, only_first_stage,image_prompt)
  File "/home/user/app/model.py", line 1178, in run
    set_random_seed(seed)
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/SwissArmyTransformer/arguments.py", line 429, in set_random_seed
    torch.manual_seed(seed)
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/torch/random.py", line 40, in manual_seed
    torch.cuda.manual_seed_all(seed)
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/torch/cuda/random.py", line 113, in manual_seed_all
    _lazy_call(cb, seed_all=True)
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/torch/cuda/__init__.py", line 156, in _lazy_call
    callable()
  File "/home/user/.pyenv/versions/3.9.13/lib/python3.9/site-packages/torch/cuda/random.py", line 111, in cb
    default_generator.manual_seed(seed)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
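Since the message says the error may be reported asynchronously, the failing kernel is not necessarily the seed call shown at the bottom of the traceback. To follow the hint, the variable has to be set before CUDA is initialized. A minimal sketch, assuming it goes at the very top of the Space's entry script, before `torch` is imported (the helper name `enable_sync_cuda_errors` is hypothetical, not part of the Space's code):

```python
import os


def enable_sync_cuda_errors(env=os.environ):
    """Ask CUDA to run kernels synchronously so errors surface at the real call site.

    Hypothetical helper for illustration. It only sets an environment variable;
    it must run before the process initializes CUDA to have any effect.
    """
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env["CUDA_LAUNCH_BLOCKING"]


# Call this before `import torch` (or at least before any CUDA work).
enable_sync_cuda_errors()
```

This slows execution, so it is probably only worth enabling while debugging.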
We haven’t changed the code in two weeks, and the Space was working fine until a few days ago, although we occasionally had to reboot it because of CUDA OOM errors (see this discussion). Also, the code works fine in my GCP environment when I clone the Space and run it there.
How can I fix this?