Hello experts, please help with running DeepSeek-R1-0528-Qwen3-8B locally

jimxu7 · June 9, 2025, 10:27pm

Has anyone used a GeForce RTX 5070 12GB to accelerate the DeepSeek-R1-0528-Qwen3-8B model? I'm on a budget and would rather not buy an RTX 5080 16GB for a local setup.

John6666 · June 10, 2025, 4:27am

The 8B model is larger than you might expect: unquantized, at 16-bit precision, the weights alone are about 16GB, so it does not actually fit into 16GB of VRAM once you add the working memory needed for inference. If you quantize it to reduce the size, it can fit into about 6GB, but that is slightly more complicated than running it unquantized, because conversion work is required.
However, it’s not that difficult now.
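
As a rough back-of-the-envelope check of those numbers, here is a minimal Python sketch. It assumes a nominal 8 billion parameters and counts only the weights; the KV cache, activations, and runtime overhead come on top, which is why the real footprint is larger than what it prints. The function name `weight_size_gb` is just for illustration.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: nominal "8B" parameter count; the exact count is not given here.
N_PARAMS = 8e9

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    size = weight_size_gb(N_PARAMS, bits)
    print(f"{label}: ~{size:.0f} GB of weights (plus cache and overhead)")
```

So a 4-bit quantization leaves some headroom on a 12GB card, while the 16-bit weights by themselves already use up a 16GB card.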

Using Ollama eliminates most of the setup work, and if you use models from Hugging Face, you don’t even need to perform the conversion. It runs smoothly on GeForce, Radeon, and CPUs without any complicated settings. I highly recommend it.
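
To make that concrete, here is a small Python sketch that builds the Ollama command for pulling a GGUF model straight from Hugging Face, using the `hf.co/<user>/<repo>` reference form. The repository name and the `Q4_K_M` quantization tag below are placeholders I made up for illustration, not a recommendation of a specific upload; substitute a real GGUF repository, and check which quantization tags it offers.

```python
from typing import List, Optional


def ollama_hf_command(repo_id: str, quant: Optional[str] = None) -> List[str]:
    """Build an `ollama run` command for a GGUF repo hosted on Hugging Face."""
    ref = f"hf.co/{repo_id}"
    if quant:
        # Optional tag selecting one quantization file from the repo.
        ref += f":{quant}"
    return ["ollama", "run", ref]


# Hypothetical repository and quantization tag, for illustration only.
cmd = ollama_hf_command("some-user/DeepSeek-R1-0528-Qwen3-8B-GGUF", "Q4_K_M")
print(" ".join(cmd))

# To actually start the model (requires Ollama to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

The sketch only prints the command so you can inspect it first; pasting the printed line into a terminal does the same thing as the commented-out `subprocess.run` call.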

jimxu7 · June 10, 2025, 2:39pm

Thanks much John! That is very helpful.
