How to Log Training Loss at Step Zero in Hugging Face Trainer or SFT Trainer?

I'm using the Hugging Face Trainer (or SFTTrainer) for fine-tuning, and I want to log the training loss at step 0, before any training steps are executed. I know there's an eval_on_start option for evaluation, but I couldn't find a direct equivalent for logging the training loss at the beginning of training.

Is there a way to log the initial training loss at step zero (before any updates) using Trainer or SFTTrainer? Ideally, I’d like something similar to eval_on_start.

Here’s what I’ve tried so far:

Solution 1: Custom Callback

I implemented a custom callback to log the training loss at the start of training:

import wandb
from transformers import TrainerCallback

class TrainOnStartCallback(TrainerCallback):
    def on_train_begin(self, args, state, control, logs=None, **kwargs):
        # Log at step 0; no loss has been computed yet, so there is
        # no real value to report here
        logs = logs or {}
        logs["train/loss"] = None  # Replace None with an initial value if available
        logs["train/global_step"] = 0
        self.log(logs)

    def log(self, logs):
        print(f"Logging at start: {logs}")
        wandb.log(logs)

# Adding the callback to the Trainer
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    args=training_args,
    optimizers=(optimizer, scheduler),
    callbacks=[TrainOnStartCallback()],
)

This works, but it feels like overkill: it logs placeholder metrics at the start of training, before any steps have run, without computing an actual loss value.

Solution 2: Manual Logging

Alternatively, I manually log the training loss before starting training:

wandb.log({"train/loss": None, "train/global_step": 0})
trainer.train()
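To log a real number instead of None, one option is to run a single forward pass over one batch before calling trainer.train(). The sketch below is illustrative only and uses a toy torch model and dataset as stand-ins; with a Hugging Face model you would instead compute `model(**batch).loss` on a batch from your train_dataset.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def initial_loss(model, dataloader, loss_fn):
    """Compute the loss on one batch before any optimizer update."""
    model.eval()
    with torch.no_grad():
        inputs, targets = next(iter(dataloader))
        return loss_fn(model(inputs), targets).item()

# Toy setup standing in for the real model/dataset (assumption, not Trainer API)
model = nn.Linear(4, 2)
dataset = TensorDataset(torch.randn(8, 4), torch.randint(0, 2, (8,)))
loader = DataLoader(dataset, batch_size=4)

loss0 = initial_loss(model, loader, nn.CrossEntropyLoss())
# Then log it manually before training:
# wandb.log({"train/loss": loss0, "train/global_step": 0})
```

This keeps the step-0 value honest (an actual forward-pass loss) while still using plain manual logging rather than a callback.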

Question:

Are there any built-in features in Trainer or SFTTrainer to log the training loss at step zero? Or are a custom callback or manual logging the best options here? If so, is there a better way to implement this — something analogous to eval_on_start, but for training (a train_on_start)?



What about logging_first_step? From the TrainingArguments docs: logging_first_step (bool, *optional*, defaults to False) — Whether to log the first global_step or not.
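A minimal configuration sketch of that flag (field names are the standard TrainingArguments ones; output_dir is a placeholder). Note that logging_first_step logs at the first global step, i.e. after the first optimizer update, not a true pre-training loss at step 0:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",          # placeholder path
    logging_first_step=True,   # log metrics at the first global_step
    logging_steps=10,          # regular logging interval thereafter
    report_to="wandb",         # assumes wandb logging, as in the question
)
```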

btw, ref: How to Log Training Loss at Step Zero in Hugging Face Trainer or SFT Trainer? · Issue #34981 · huggingface/transformers · GitHub
