Why is SFT in TRL even though it's not using RL at all

qmsoqm · January 19, 2025, 7:18pm

As far as I know, SFT is basically continued post-training: the weights are updated by training the model to predict the next token. If this is correct, then why is SFT included in the TRL (Transformer Reinforcement Learning) API? Am I missing something?
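To make concrete what I mean by "predict the next tokens", here is a minimal sketch of the objective as I understand it. It is plain Python rather than TRL's actual code, and the function name `next_token_loss` and its inputs are my own illustration, not anything from the library.

```python
import math

def next_token_loss(logits, token_ids):
    """Average cross-entropy of predicting token t+1 from position t.

    logits: one list of vocabulary scores per position (hypothetical model output).
    token_ids: the training sequence, same length as logits.
    """
    total = 0.0
    count = 0
    # Position t predicts token t+1, so the last position has no target.
    for scores, target in zip(logits[:-1], token_ids[1:]):
        # Numerically stable log of the softmax normalizer.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        # Negative log-probability assigned to the correct next token.
        total += log_z - scores[target]
        count += 1
    return total / count
```

Nothing in this loss involves a reward, a policy rollout, or sampling from the model; it is ordinary supervised learning, which is why the categorization confuses me.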
