SFTTrainer Loss function

The loss function used is the cross-entropy loss. It is defined within the model itself, e.g. in the Llama model implementation. If you want to use your own custom loss function, you can override the compute_loss method of the Trainer, as described in the Trainer documentation.
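To make the default behaviour concrete, here is a minimal plain-Python sketch of next-token cross-entropy. It is only an illustration, not the code from the model: the function name `causal_lm_loss` and the list-based inputs are assumptions made for this example. It does follow two conventions used by causal LM losses in transformers: the logits at position t are scored against the label at position t+1, and labels equal to -100 are skipped.

```python
import math

IGNORE_INDEX = -100  # label value that is excluded from the loss


def causal_lm_loss(logits, labels):
    """Mean next-token cross-entropy (illustrative helper, not a library function).

    logits: list of per-position score lists, shape (seq_len, vocab_size)
    labels: list of token ids, length seq_len; IGNORE_INDEX marks skipped positions
    """
    total, count = 0.0, 0
    # Position t predicts token t+1, so pair logits[:-1] with labels[1:].
    for scores, target in zip(logits[:-1], labels[1:]):
        if target == IGNORE_INDEX:
            continue
        # Numerically stable log-sum-exp over the vocabulary.
        max_score = max(scores)
        log_z = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
        total += log_z - scores[target]  # negative log-probability of the target
        count += 1
    # If every label is ignored, return 0.0 instead of dividing by zero.
    return total / count if count else 0.0
```

A custom loss would replace this computation inside an overridden compute_loss; the masking convention with -100 is what makes it possible to exclude some tokens without changing the loss function itself.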

  1. When training an LLM, the loss is computed on the whole concatenated prompt. How can this be changed so that the loss is computed only on the output (completion) tokens?

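Indeed, as recommended above, the DataCollatorForCompletionOnlyLM can be used for this purpose. It sets the labels of the prompt tokens to the ignore index (-100), so only the completion tokens contribute to the loss.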
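The idea behind completion-only training can be sketched in plain Python as follows. This is a simplified illustration of the masking step, not the collator's actual code: `mask_prompt_tokens` is a hypothetical helper, the token ids are made up, and it masks up to the first occurrence of the response template, which may differ from how the library picks the match.

```python
IGNORE_INDEX = -100  # labels with this value are excluded from the loss


def mask_prompt_tokens(input_ids, response_template_ids):
    """Build labels so that only tokens after the response template are trained on.

    Hypothetical helper for illustration; both arguments are lists of token ids.
    """
    labels = list(input_ids)
    n = len(response_template_ids)
    # Find where the response template (e.g. the tokens of "### Answer:") starts.
    for i in range(len(input_ids) - n + 1):
        if input_ids[i:i + n] == response_template_ids:
            end = i + n
            # Ignore the prompt and the template itself; keep the completion.
            labels[:end] = [IGNORE_INDEX] * end
            return labels
    # Template not found: ignore the whole example rather than train on the prompt.
    return [IGNORE_INDEX] * len(labels)
```

For example, with input ids `[5, 6, 7, 8, 9]` and a template of `[7, 8]`, only the final token keeps its label, so the cross-entropy is computed on the completion alone.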
