How to Log Training Loss at Step Zero in Hugging Face Trainer or SFT Trainer?

I'm using the Hugging Face Trainer (or SFTTrainer) for fine-tuning, and I want to log the training loss at step 0, before any training steps are executed. I know there's an eval_on_start option for evaluation, but I couldn't find a direct equivalent for logging the training loss at the beginning of training.

Is there a way to log the initial training loss at step zero (before any updates) using Trainer or SFTTrainer? Ideally, I’d like something similar to eval_on_start.

Here’s what I’ve tried so far:

Solution 1: Custom Callback

I implemented a custom callback to log the training loss at the start of training:

import wandb
from transformers import TrainerCallback

class TrainOnStartCallback(TrainerCallback):
    def on_train_begin(self, args, state, control, logs=None, **kwargs):
        # Log at step 0; no loss has been computed yet, so there is
        # no real value to report here
        logs = logs or {}
        logs["train/loss"] = None  # Replace None with an initial value if available
        logs["train/global_step"] = 0
        self.log(logs)

    def log(self, logs):
        print(f"Logging at start: {logs}")
        wandb.log(logs)

# Adding the callback to the Trainer
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    args=training_args,
    optimizers=(optimizer, scheduler),
    callbacks=[TrainOnStartCallback()],
)

This works, but it feels like overkill: it logs placeholder metrics at the start of training, before any steps have run, without computing an actual loss value.

Solution 2: Manual Logging

Alternatively, I manually log the training loss before starting training:

wandb.log({"train/loss": None, "train/global_step": 0})
trainer.train()
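To log a real number instead of None, one option is to run a single forward pass over one batch before calling trainer.train(). The sketch below is illustrative only and uses a toy torch model and dataset as stand-ins; with a Hugging Face model you would instead compute `model(**batch).loss` on a batch from your train_dataset.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def initial_loss(model, dataloader, loss_fn):
    """Compute the loss on one batch before any optimizer update."""
    model.eval()
    with torch.no_grad():
        inputs, targets = next(iter(dataloader))
        return loss_fn(model(inputs), targets).item()

# Toy setup standing in for the real model/dataset (assumption, not Trainer API)
model = nn.Linear(4, 2)
dataset = TensorDataset(torch.randn(8, 4), torch.randint(0, 2, (8,)))
loader = DataLoader(dataset, batch_size=4)

loss0 = initial_loss(model, loader, nn.CrossEntropyLoss())
# Then log it manually before training:
# wandb.log({"train/loss": loss0, "train/global_step": 0})
```

This keeps the step-0 value honest (an actual forward-pass loss) while still using plain manual logging rather than a callback.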

Question:

Are there any built-in features in Trainer or SFTTrainer to log the training loss at step zero? Or are a custom callback or manual logging the best options here? If so, is there a better way to implement this — something analogous to eval_on_start, but for training (a train_on_start)?



What about logging_first_step? From the TrainingArguments docs: logging_first_step (bool, *optional*, defaults to False) — Whether to log the first global_step or not.
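A minimal configuration sketch of that flag (field names are the standard TrainingArguments ones; output_dir is a placeholder). Note that logging_first_step logs at the first global step, i.e. after the first optimizer update, not a true pre-training loss at step 0:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",          # placeholder path
    logging_first_step=True,   # log metrics at the first global_step
    logging_steps=10,          # regular logging interval thereafter
    report_to="wandb",         # assumes wandb logging, as in the question
)
```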

btw, ref: How to Log Training Loss at Step Zero in Hugging Face Trainer or SFT Trainer? · Issue #34981 · huggingface/transformers · GitHub
