I noticed that, according to the trainer’s documentation, when fine-tuning the model, I am required to provide a text field (trl/trl/trainer/sft_trainer.py at 18a33ffcd3a576f809b6543a710e989333428bd3 · huggingface/trl · GitHub). However, this does not seem to be a supervised task! Upon further exam…

Hi, SFT (supervised fine-tuning) is called supervised because the training data is collected from humans. However, we’re still training the model with the same cross-entropy loss as during pre-training (i.e. predicting the next token). We now just make it more likely that the model will generate a use…
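To make the "same cross-entropy loss" point concrete, here is a minimal pure-Python sketch of a next-token loss. It is a toy illustration, not TRL's or transformers' actual implementation: the function name `next_token_cross_entropy`, the 3-token vocabulary, and the logits are all made up for the example.

```python
import math

def next_token_cross_entropy(logits, token_ids):
    """Mean cross-entropy for predicting token t+1 from position t.

    logits[t] holds one score per vocabulary entry for the token that
    follows position t; token_ids is the sequence itself.
    """
    # Shift: the last position has no next token, the first token is never a target.
    shifted_logits = logits[:-1]
    targets = token_ids[1:]
    total = 0.0
    for row, target in zip(shifted_logits, targets):
        # Numerically stable log-sum-exp over the vocabulary.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        # Negative log-probability of the correct next token.
        total += log_z - row[target]
    return total / len(targets)

# Hypothetical toy data: vocabulary of 3 tokens, sequence of 3 tokens.
token_ids = [0, 2, 1]

# A model with no preference assigns equal scores to every token.
uniform_logits = [[0.0, 0.0, 0.0] for _ in token_ids]
uniform_loss = next_token_cross_entropy(uniform_logits, token_ids)

# A model that puts high scores on the correct next tokens gets a lower loss.
confident_logits = [[0.0, 0.0, 5.0], [0.0, 5.0, 0.0], [0.0, 0.0, 0.0]]
confident_loss = next_token_cross_entropy(confident_logits, token_ids)

print(uniform_loss, confident_loss)
```

The objective has the same form whether the text comes from a pre-training corpus or from human-written instruction data; what changes in SFT is the data the loss is computed on.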

Fine tune with SFTTrainer

Intermediate

nielsr June 12, 2024, 2:16pm 8

That looks like an issue with data preparation. Are you using the tokenizer to prepare data for the model?

Finetuning with SFTtrainer

Topic	Category	Replies	Views	Activity
Instruction tuning llm	Beginners	8	11871	May 8, 2024
Finetuning with SFTtrainer	Intermediate	1	406	June 12, 2024
Fine-tuning queries	Beginners	0	34	February 20, 2025
[LMM Fine Tuning] Supervised Fine Tuning Trainer (SFTTrainer) vs transformers Trainer	Intermediate	1	1659	November 29, 2023
SFT Trainer and chat templates	Beginners	3	161	March 26, 2025
