What is the essential reason for using SFT training?

kimsin · February 3, 2025, 6:24pm

Hello, I am writing this post because a question about SFT training came up while I was doing model engineering, and I would like to ask the experts. My understanding is that a model trained with SFT learns to give answers by minimizing the loss on the training data. However, is there a reason to do SFT training, which is costly, when an easy and excellent technique like RAG is available? It seems that RAG would be better at simply providing answers from prepared documents.

On the other hand, SFT has the advantage that you can set the generation prompts so that the model learns to respond accordingly, but I wonder whether this could be solved by prompt engineering alone. It is also said that SFT training increases understanding of a specific domain. Does this mean that the fine-tuning data acts as background knowledge that improves the model’s understanding? If so, how much training is needed for the model to reach that understanding? I am also doubtful that I can pull it off. I am sorry for my incoherent writing.
