Issue with CUDA availability on an A10 GPU Space instance

We have been using Hugging Face Spaces to host a model demo on an A10 GPU instance. Until recently everything worked as expected and our demo ran smoothly.

However, our demo has suddenly stopped working. Upon investigation, we found that we can no longer use CUDA with PyTorch on the instance: torch.cuda.is_available() returns False, and the accompanying warning suggests the issue may be an outdated CUDA driver.
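
For reference, a minimal diagnostic along these lines can confirm whether the installed driver is older than the CUDA runtime PyTorch was built against (nvidia-smi being present on the instance is an assumption):

import subprocess

import torch

# Report the CUDA runtime PyTorch was built against and whether it
# can actually reach a GPU.
print("torch:", torch.__version__)
print("built with CUDA:", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())

# nvidia-smi reports the installed driver and the highest CUDA version
# that driver supports; a driver older than torch.version.cuda would
# explain is_available() returning False.
try:
    print(subprocess.check_output(["nvidia-smi"], text=True))
except (FileNotFoundError, subprocess.CalledProcessError) as exc:
    print("could not run nvidia-smi:", exc)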

Does anyone know of a solution to this problem?

Hi @tanahhh,

Could you please share more about your Space? Are you using the Docker or Gradio SDK?
cc @chris-rannou, maybe a recent internal infra change has an impact on this?

@radames

We don’t use Docker.
Our project has just a requirements.txt and an app.py; the contents of each are below.

requirements.txt:

accelerate
protobuf
sentencepiece
torch>=2.0.1
pillow
transformers
app.py:

import gradio as gr
import torch

if __name__ == "__main__":
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(device, flush=True)
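
On the A10 hardware we expect this to print cuda, but it currently prints cpu. A variant like the sketch below (our assumption about how to surface the problem, since torch.cuda.is_available() swallows the underlying error) may expose the actual driver failure:

import torch

# Sketch: force CUDA initialization so the underlying failure
# (e.g. an insufficient driver version) is raised as an exception
# rather than being reduced to a bare False.
try:
    torch.cuda.init()
    print("CUDA OK:", torch.cuda.get_device_name(0))
except Exception as exc:  # RuntimeError on driver problems
    print("CUDA init failed:", exc)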