Hello experts, please help with running DeepSeek-R1-0528-Qwen3-8B locally

Has anyone used a GeForce RTX 5070 12GB to accelerate the DeepSeek-R1-0528-Qwen3-8B model? I am on a budget and would rather not buy an RTX 5080 16GB for a local setup.


The 8B model is larger than you might expect. At 16-bit precision the weights alone take about 16GB, so once working memory is added it does not actually fit into 16GB of VRAM. If you quantize it to reduce the size, it can fit into about 6GB, but that is slightly more complicated than running it unquantized, because conversion work is required.
However, it’s not that difficult now.
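
To make the size argument concrete, here is a rough back-of-the-envelope sketch in Python. It only counts the weights; the 8.0 billion parameter count is the nominal figure from the model name, and the bit widths are illustrative assumptions. Real quantized formats usually store a bit more than the nominal bits per weight, and the context cache needs extra memory on top, so treat these as lower bounds.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


PARAMS_B = 8.0  # nominal size taken from the "8B" in the model name (assumption)

# Bit widths are illustrative: 16 = unquantized half precision, 8 and 4 = quantized.
for label, bits in [("16-bit (unquantized)", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS_B, bits):.0f} GB for weights")
```

By this estimate a 4-bit version leaves plenty of headroom on a 12GB card, while the unquantized model already fills a 16GB card before any working memory is counted.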

Using Ollama eliminates most of the setup work, and if you use already-quantized (GGUF) models from Hugging Face, you don’t even need to perform the conversion yourself. It runs smoothly on GeForce, Radeon, and CPUs without any complicated settings. I highly recommend it.
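
Once Ollama is installed and a model has been pulled, you can also talk to it from a script through its local HTTP API. The following is a minimal sketch, not an official client: it assumes the server is listening on Ollama's default port 11434, and the model tag `deepseek-r1:8b` is an assumption, so replace it with whatever name `ollama list` shows on your machine. The helper names `build_request` and `ask` are made up for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "deepseek-r1:8b"  # assumed tag; use the name shown by `ollama list`


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = MODEL_TAG) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `print(ask("Why is the sky blue?"))` should print the model's answer. Nothing here is GPU-specific; Ollama decides on its own whether to use the GPU or fall back to the CPU.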


Thanks much John! That is very helpful.
