SFTTrainer loss function and formatting_func

I have found the origin of the “apparent” model collapse:

1- the training itself was fine, BUT…

2- apparently, SFTTrainer leaves the model in training mode after training. And in my notebook, I did not persist the model before running the evaluation, nor did I call model.eval().

Result: the model was still in training mode, with dropout active and so on, so it predicted the first token and then looped on it until the max length was reached.
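For anyone hitting the same issue, here is a minimal sketch of the fix, assuming `trainer` is the SFTTrainer and `tokenizer` comes from the earlier training cells (the prompt is just a placeholder):

```python
import torch

# Assumed setup: `trainer` (an SFTTrainer) and `tokenizer`
# are already defined in earlier notebook cells.
trainer.train()

model = trainer.model
model.eval()  # switch back to inference mode: disables dropout etc.

prompt = "Hello"  # placeholder prompt for illustration
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=50)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```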

The HF documentation could be updated to mention this SFTTrainer behavior, if it does not already.

On the Google side, we will update the Colab material too, since this behavior is not mentioned there.

Thanks everyone for your support and help :wink:

Jerome
