When we train on a sequence of messages (e.g. user message #1, assistant message #1, user message #2, assistant message #2), is the model trained to generate only the last assistant message in the sequence (assistant message #2), with all previous messages serving only as context? Or is it trained to generate each assistant message separately (both assistant message #1 and assistant message #2)? In other words, what exactly is the model's training target?
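For concreteness, the two interpretations can be sketched as label-masking schemes. This is a minimal sketch in plain Python with placeholder token ids and a hypothetical `build_labels` helper, not any specific trainer's API; it only illustrates which positions would contribute to the loss under each option:

```python
# Convention used by many trainers: positions labeled -100 are skipped by the loss.
IGNORE_INDEX = -100

# A conversation flattened to tokens, tracking each token's role.
# (Token values are placeholders, not real tokenizer output.)
conversation = [
    ("user", [11, 12, 13]),       # user message #1
    ("assistant", [21, 22]),      # assistant message #1
    ("user", [31, 32]),           # user message #2
    ("assistant", [41, 42, 43]),  # assistant message #2
]

def build_labels(conversation, target="all_assistant"):
    """Return (input_ids, labels); masked positions get IGNORE_INDEX."""
    input_ids, labels = [], []
    last_assistant = max(
        i for i, (role, _) in enumerate(conversation) if role == "assistant"
    )
    for i, (role, toks) in enumerate(conversation):
        input_ids.extend(toks)
        is_target = role == "assistant" and (
            target == "all_assistant" or i == last_assistant
        )
        labels.extend(toks if is_target else [IGNORE_INDEX] * len(toks))
    return input_ids, labels

# Option A: every assistant message contributes to the loss.
_, labels_all = build_labels(conversation, target="all_assistant")
# Option B: only the final assistant message is the target.
_, labels_last = build_labels(conversation, target="last_assistant")
```

Under option A, a single training example teaches both assistant turns at once (earlier user turns are still only context); under option B, assistant message #1 is pure context and only assistant message #2 is supervised.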