Fine-tuning "reasoning" models

Hi everyone,

I'm interested in fine-tuning a reasoning model. I have a specific idea in mind, but no code to share yet.

I have a task that would likely benefit from inference-time scaling, but it is not a math or code task. It requires fine-grained work with sequences: aligning two or more sequences, comparing the correspondences between them, modifying them, and so on.

In this case, I wonder whether it makes more sense to start from a model that was already post-trained for a different reasoning task, such as DeepSeek-R1, or to take a pre-trained base model, fine-tune it for my specific task, and apply some inference-time scaling techniques during fine-tuning?


That’s an interesting problem!
Since your task involves sequence manipulation and comparison, starting from a pre-trained model is a good choice. DeepSeek-R1 is post-trained for reasoning, but its strengths are geared toward formal domains like math and code; for a sequence-based task, a standard LLM pre-trained on a large text corpus may be a better fit. You can then fine-tune it specifically for your sequence alignment and comparison task, and apply inference-time scaling techniques on top to improve performance. This approach lets you benefit from the general language understanding of the pre-trained model while specializing it for your needs.
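For concreteness, here is a minimal sketch of what that fine-tuning step could look like with TRL's `SFTTrainer`. The base model name, the toy dataset, and the `Answer:` output convention are all placeholder assumptions for illustration, not a tested recipe:

```python
# Minimal supervised fine-tuning sketch using TRL (assumes a recent TRL version).
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical training examples: each flattens a prompt, a step-by-step
# reasoning trace, and the final answer into a single "text" field.
train_data = Dataset.from_list([
    {"text": "Align the sequences ACGTAC and ACTAC.\n"
             "Reasoning: positions 1-2 match (AC), position 3 needs a gap...\n"
             "Answer: ACGTAC / AC-TAC"},
    # ... more examples with reasoning traces for your task ...
])

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # placeholder: any small pre-trained base model
    train_dataset=train_data,
    args=SFTConfig(output_dir="seq-reasoner"),
)
trainer.train()
```

Training the model to emit an explicit reasoning trace before the answer is what makes inference-time scaling useful later: sampling more traces gives you more chances to land on a correct one.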
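And here is one common inference-time scaling technique you could then apply at test time: self-consistency (best-of-N sampling), where you draw several independent reasoning traces and majority-vote on the final answer. The checkpoint path and the `Answer:` marker are assumptions carried over from the sketch above:

```python
# Self-consistency sketch: sample N completions, majority-vote on the answer.
from collections import Counter

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "seq-reasoner"  # hypothetical fine-tuned checkpoint from above
tok = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Align the sequences ACGTAC and ACTTAC.\nReasoning:"
inputs = tok(prompt, return_tensors="pt").to(model.device)

# Sample N independent reasoning traces; more samples = more test-time compute.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.8,
    num_return_sequences=8,
    pad_token_id=tok.eos_token_id,
)
completions = tok.batch_decode(
    outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)

# Extract each final answer and keep the most frequent one.
answers = [c.split("Answer:")[-1].strip() for c in completions if "Answer:" in c]
best_answer, votes = Counter(answers).most_common(1)[0]
print(f"{votes}/{len(completions)} samples agree on: {best_answer}")
```

The nice property of this setup is that the compute/accuracy trade-off is a runtime knob (`num_return_sequences`) rather than something baked into the model.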
