Fine Tuning DeepSeek v3?

Hi,

So I have a Gradio Space running with DeepSeek V3 using L4 GPU resources. It's running great.

Is it possible to fine-tune DeepSeek V3 on several thousand, or even just a few hundred, pages of PDF content? The PDFs are almost entirely plain text.

If so, what should I consider, and what would be a smart, effective way of going about this?

Thanks everyone!

Fine-tuning the full version of DeepSeek V3 is not impossible, but it would require far more hardware than a single L4 GPU.
Fine-tuning a distilled DeepSeek model with LoRA or QLoRA, on the other hand, can be done with significantly fewer resources. The distilled model will be less accurate than the full one, so whether that trade-off is acceptable depends on your use case.
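To give a rough sense of why LoRA needs fewer resources: it freezes the original weight matrix and trains two small low-rank matrices next to it. Here is a minimal arithmetic sketch. The layer size (4096 x 4096) and the rank (8) are illustrative assumptions, not the dimensions of any particular DeepSeek model.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter on the same layer.

    LoRA adds two matrices: A with shape (rank, d_in) and
    B with shape (d_out, rank). Only these are trained.
    """
    return rank * (d_in + d_out)


# Hypothetical layer size and rank, chosen only for illustration.
d_in, d_out, rank = 4096, 4096, 8

full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)
fraction = lora / full

print(f"full: {full:,}  LoRA: {lora:,}  fraction: {fraction:.2%}")
```

With these assumed numbers the adapter has 65,536 trainable parameters against 16,777,216 for the full matrix, so well under 1% of that layer is trained. QLoRA goes one step further by also storing the frozen base weights in 4-bit precision.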

Additionally, if the goal is simply to answer accurately from the content of the documents, a RAG (retrieval-augmented generation) approach is often more cost-effective than fine-tuning.
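As a rough illustration of the RAG idea, here is a minimal sketch using only the Python standard library: it scores text chunks against a question with TF-IDF and cosine similarity, then puts the best chunk into the prompt. The sample chunks, the function names, and the prompt wording are all hypothetical. A real setup would more likely extract text from the PDFs first and use an embedding model with a vector store instead of TF-IDF.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(chunks):
    """Return (idf, vectors): IDF weights and one TF-IDF vector per chunk."""
    tokenized = [tokenize(c) for c in chunks]
    n = len(chunks)
    # Document frequency: in how many chunks each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()}
        for toks in tokenized
    ]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, idf, vectors, k=2):
    """Return the k chunks most similar to the query."""
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    ranked = sorted(
        range(len(chunks)),
        key=lambda i: cosine(q, vectors[i]),
        reverse=True,
    )
    return [chunks[i] for i in ranked[:k]]


def build_prompt(query, passages):
    """Assemble a prompt that asks the model to answer from the passages."""
    context = "\n\n".join(passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical chunks standing in for text extracted from the PDFs.
chunks = [
    "The warranty covers parts and labor for two years.",
    "Returns are accepted within thirty days of purchase.",
    "The device charges fully in about ninety minutes.",
]

idf, vectors = build_index(chunks)
question = "How long is the warranty?"
top = retrieve(question, chunks, idf, vectors, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

The resulting prompt is what you would send to the model you already have running. Nothing is retrained, which is why this tends to be cheaper than fine-tuning, and updating the knowledge only means re-indexing the documents.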