I built a local AI workstation around an RTX 5090 (32 GB) for an uninterrupted, offline coding workflow.
OS: Debian 12 with a pinned NVIDIA .run driver (version held back so kernel updates don't break the driver).
LLMs: each in its own Python venv to keep the global stack clean.
Tools in a default “example-venv”: PyTorch, SciPy, NumPy, pandas, Matplotlib, scikit-learn.
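For anyone replicating the per-model venv layout, here's a minimal sketch using Python's stdlib `venv` module (the path and package subset are illustrative; the full tool list above installs the same way):

```python
import subprocess
import sys
import venv
from pathlib import Path

# Illustrative location; one directory per model/tool stack.
venv_dir = Path.home() / "venvs" / "example-venv"

# Create the venv with pip available, leaving the system Python untouched.
venv.EnvBuilder(with_pip=True).create(venv_dir)

# Install into the venv by calling its own interpreter directly
# (no activation needed). numpy stands in here for the full stack:
# torch, scipy, pandas, matplotlib, scikit-learn install the same way.
venv_python = venv_dir / "bin" / "python"
subprocess.run(
    [str(venv_python), "-m", "pip", "install", "--quiet", "numpy"],
    check=True,
)
print(venv_python.exists())
```

Calling the venv's interpreter directly instead of sourcing `activate` keeps the setup scriptable and avoids polluting the current shell session.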
Short demo + full setup notes:
→ https://localprompt.ai/demo.mp4
→ https://localprompt.ai
→ System Specifications – LocalPrompt.ai
Current favorite: DeepSeek-Coder-V2-Lite-Instruct (GGUF, Q8_0) for offline code help; I run it locally and execute/validate its output inside the venv.
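The execute/validate step can be sketched like this: write the model's generated code to a temp file and run it with the venv's interpreter, capturing output. The venv path is illustrative, and the fallback to `sys.executable` is just so the sketch runs anywhere:

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Illustrative venv location; fall back to the current interpreter for the demo.
VENV_PYTHON = Path.home() / "venvs" / "example-venv" / "bin" / "python"
if not VENV_PYTHON.exists():
    VENV_PYTHON = Path(sys.executable)

def run_in_venv(code: str, timeout: int = 30) -> subprocess.CompletedProcess:
    """Write model-generated code to a temp file and execute it in the venv."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        script = f.name
    return subprocess.run(
        [str(VENV_PYTHON), script],
        capture_output=True, text=True, timeout=timeout,
    )

# Validate a trivial generated snippet by checking its actual output.
result = run_in_venv("print(sum(range(10)))")
print(result.returncode, result.stdout.strip())  # → 0 45
```

Capturing stdout/stderr with a timeout means a hallucinated infinite loop or import error shows up as a failed run instead of hanging the session.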
I’d love feedback on two points:
- With a 32 GB GPU, which models are working best for you in practice as a coding assistant?
- For longer tasks, do you prefer a slightly smaller model with a bigger context window, or a stronger model, accepting the risk that it loses track of earlier chat history?