Plot Loss Curve with Trainer()


I am fine-tuning a BERT model for a multiclass classification problem. During training my losses look a bit “unhealthy”: my validation loss (evaluated with eval_steps=20) is always smaller than my training loss. How can I plot a loss curve for a model trained with Trainer()?


Scott from Weights & Biases here. Don’t want to be spammy so will delete this if it’s not helpful. You can plot losses to W&B by passing report_to to TrainingArguments.

from transformers import TrainingArguments, Trainer

args = TrainingArguments(... , report_to="wandb")
trainer = Trainer(... , args=args)

More info here: Logging & Experiment tracking with W&B


Hey Scott,
I think it’s helpful, but I already do that. What I’m really after is a way to plot the losses directly in my notebook. Any idea how to achieve that? Cheers

Note that a validation loss smaller than the training loss is not necessarily bad or weird when working with advanced architectures and techniques, since you are not really comparing equivalent things. For example, consider dropout, which “cancels” some connections during training but uses all of them during evaluation (validation).
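
To make the dropout point concrete, here is a toy sketch in plain Python. It is not BERT: it assumes a fixed linear model that fits the data exactly, and the names (`dropout`, `mse`, the weights, the drop rate of 0.1) are made up for illustration. The same weights are scored on the same data, once with dropout active and once without:

```python
import random


def dropout(values, p, rng):
    """Inverted dropout: zero each value with probability p, rescale the rest."""
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def mse(weights, inputs, targets, p=0.0, rng=None):
    """Mean squared error of a linear model; dropout is applied only if p > 0."""
    total = 0.0
    for x, y in zip(inputs, targets):
        if p > 0.0:
            x = dropout(x, p, rng)
        pred = sum(w * xi for w, xi in zip(weights, x))
        total += (pred - y) ** 2
    return total / len(inputs)


rng = random.Random(0)
weights = [0.5, -1.0, 2.0]  # hypothetical "trained" weights
inputs = [[rng.uniform(-1, 1) for _ in weights] for _ in range(200)]
# Targets are generated by the same weights, so the model is a perfect fit.
targets = [sum(w * xi for w, xi in zip(weights, x)) for x in inputs]

train_mode_loss = mse(weights, inputs, targets, p=0.1, rng=rng)  # dropout on
eval_mode_loss = mse(weights, inputs, targets)                   # dropout off
print(train_mode_loss, eval_mode_loss)
```

Even on identical data the dropout-on loss comes out higher, which is the sense in which train and validation losses are not measuring the same thing.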


I trained a few other BERT models, and all of them seem to need a few steps (up to 50) before the train loss drops below the validation loss, even with different random states etc. Do you think I don’t really have to worry? After those “starting problems” the losses look normal/healthy to me (0.3 vs 0.6 when finished with early stopping).

I obviously can’t say! But the fact that val loss is lower than train would not be a big concern to me! How those losses evolve seems more important. And of course if the model performance actually improves with time, that’s also more relevant! (You can see this in downstream tasks if training a language model).


You should be able to use %%wandb at the beginning of your training loop cell to see the live graphs in the output.
See: Tracking Jupyter Notebooks
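
If you would rather plot in the notebook without W&B, another option is the Trainer's own log. This is a minimal sketch, assuming `trainer.state.log_history` is a list of dicts in which training entries carry a `loss` key and evaluation entries an `eval_loss` key, each alongside `step` and `epoch`. `loss_series` and `plot_losses` are hypothetical helper names, the example log values are invented, and matplotlib is assumed to be installed:

```python
def loss_series(log_history, x_key="step"):
    """Split Trainer-style log entries into (x, loss) points for train and eval."""
    train, evaluation = [], []
    for entry in log_history:
        if x_key not in entry:
            continue
        if "loss" in entry:
            train.append((entry[x_key], entry["loss"]))
        if "eval_loss" in entry:
            evaluation.append((entry[x_key], entry["eval_loss"]))
    return train, evaluation


def plot_losses(log_history, x_key="step"):
    """Plot train and validation loss against x_key ("step" or "epoch")."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    train, evaluation = loss_series(log_history, x_key)
    fig, ax = plt.subplots()
    if train:
        ax.plot(*zip(*train), label="train loss")
    if evaluation:
        ax.plot(*zip(*evaluation), label="validation loss")
    ax.set_xlabel(x_key)
    ax.set_ylabel("loss")
    ax.legend()
    return fig


# Invented entries in the assumed log_history shape.
example_log = [
    {"loss": 1.10, "learning_rate": 5e-5, "epoch": 0.1, "step": 20},
    {"eval_loss": 0.95, "epoch": 0.1, "step": 20},
    {"loss": 0.80, "learning_rate": 4e-5, "epoch": 0.2, "step": 40},
    {"eval_loss": 0.70, "epoch": 0.2, "step": 40},
    {"train_runtime": 12.3, "train_loss": 0.95, "epoch": 0.2, "step": 40},
]
train_points, eval_points = loss_series(example_log)
```

After `trainer.train()` you would call `plot_losses(trainer.state.log_history)`, or pass `x_key="epoch"` to put epochs on the x-axis instead of steps.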


Hey scottire, is it possible for me to obtain the training metrics and load them into a pandas dataframe? I’m looking to plot these scores in matplotlib so that I can compare with models trained with other frameworks.

Also, for using wandb is there a way for me to view the plot against epochs rather than steps?


Yep that’s possible with the wandb API, see here:

import pandas as pd 
import wandb

api = wandb.Api()
entity, project = "<entity>", "<project>"  # set to your entity and project 
runs = api.runs(entity + "/" + project) 

summary_list, config_list, name_list = [], [], []
for run in runs:
    # .summary contains the output keys/values for metrics like accuracy.
    #  We call ._json_dict to omit large files
    summary_list.append(run.summary._json_dict)

    # .config contains the hyperparameters.
    #  We remove special values that start with _.
    config_list.append(
        {k: v for k, v in run.config.items()
         if not k.startswith('_')})

    # .name is the human-readable name of the run.
    name_list.append(run.name)

runs_df = pd.DataFrame({
    "summary": summary_list,
    "config": config_list,
    "name": name_list
})

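The snippet above gives one row per run with its final summary values. For a loss curve you need the per-step history instead. Here is a rough sketch, assuming the run was logged through the Trainer integration with metric names such as `train/loss`, `eval/loss` and `train/epoch` (check the names in your own project), and using `run.scan_history()` from the wandb public API. `rows_to_columns` and `fetch_history_rows` are hypothetical helper names:

```python
def rows_to_columns(rows, x_key, y_keys):
    """Turn history rows (dicts) into column lists, keeping rows with any y value."""
    columns = {key: [] for key in [x_key, *y_keys]}
    for row in rows:
        if all(row.get(key) is None for key in y_keys):
            continue  # this row logged none of the requested metrics
        columns[x_key].append(row.get(x_key))
        for key in y_keys:
            columns[key].append(row.get(key))  # None where the metric is absent
    return columns


def fetch_history_rows(run_path):
    """Fetch every logged row of one run; run_path is "<entity>/<project>/<run_id>"."""
    import wandb  # imported lazily; needs network access and a W&B login

    run = wandb.Api().run(run_path)
    # No keys filter here: train and eval metrics usually sit in different rows.
    return list(run.scan_history())


# Invented rows in the assumed shape, to show the result without network access.
example_rows = [
    {"_step": 1, "train/epoch": 0.5, "train/loss": 0.9},
    {"_step": 2, "train/epoch": 1.0, "eval/loss": 0.6},
    {"_step": 3, "train/epoch": 1.0, "train/learning_rate": 1e-5},
]
columns = rows_to_columns(example_rows, "train/epoch", ["train/loss", "eval/loss"])
```

`pd.DataFrame(columns)` then gives a frame you can plot with matplotlib, with epochs on the x-axis because `train/epoch` was chosen as `x_key`.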

Super cool! Thank you so much