Hello, I have a question about SFT (supervised fine-tuning) that came up while doing model engineering, and I would like to ask the experts. As I understand it, SFT trains the model by minimizing a loss on the training data, and the answers the model gives afterwards reflect what it learned that way. However, is there a reason to do SFT, which is costly, when an easy and excellent technique like RAG is available? It seems that RAG would be better at simply providing answers from prepared documents.

On the other hand, SFT has the advantage that you can set the generation prompts used in training so that the model learns to respond accordingly, but I wonder whether this could be solved simply with prompt engineering. It is also said that SFT increases the model's understanding of a specific domain. Does this mean that the fine-tuning data acts as background knowledge that improves the model's understanding? If so, how much training is needed for the model to reach that understanding? I am also doubtful that I can pull it off. I am sorry for my incoherent writing.
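
To make concrete what I mean by RAG "simply providing answers from prepared documents", here is a minimal sketch. The documents, the function names, and the bag-of-words similarity are all my own illustrative assumptions (a real setup would more likely use an embedding model and a vector store), and the resulting prompt would be sent to an unmodified LLM with no training step involved.

```python
import math
import re
from collections import Counter

# Hypothetical "prepared documents"; in practice these would be chunks of real files.
DOCUMENTS = [
    "Refunds are accepted within 30 days of purchase with a receipt.",
    "Standard shipping takes 3 to 5 business days.",
    "Support is available by email on weekdays.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Put the retrieved text into the prompt; the model weights stay unchanged."""
    context = "\n".join(retrieve(question, documents, k))
    return (
        f"Answer using only this context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    print(build_prompt("How long does shipping take?", DOCUMENTS))
```

With this approach the knowledge lives in the documents and the prompt, whereas with SFT it would have to be learned into the weights, which is the trade-off I am asking about.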