How to monitor both train and validation metrics at the same step?

Mariam · April 18, 2021, 6:55am

Hey @lewtun,
Thanks for your reply. When I run the above, I get train_results.json and eval_results.json with the following content:
train_results.json
{
  "epoch": 2.0,
  "init_mem_cpu_alloc_delta": 2540351488,
  "init_mem_cpu_peaked_delta": 0,
  "init_mem_gpu_alloc_delta": 266590720,
  "init_mem_gpu_peaked_delta": 0,
  "train_mem_cpu_alloc_delta": 16502784,
  "train_mem_cpu_peaked_delta": 331776,
  "train_mem_gpu_alloc_delta": 822129664,
  "train_mem_gpu_peaked_delta": 3155901440,
  "train_runtime": 18.1442,
  "train_samples": 168,
  "train_samples_per_second": 1.543
}

eval_results.json
{
  "epoch": 2.0,
  "eval_samples": 42,
  "exact_match": 39.02439024390244,
  "f1": 62.318702403264744
}

I would like to get the losses and metrics on both the train and eval datasets. At the moment, metrics (exact_match and f1) are only returned for the evaluation dataset, and no loss is reported for either one.
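To make the gap concrete, here is a minimal sketch (not part of the training script) that merges the two result files into one dictionary with split-prefixed keys. The helper name `load_split_metrics` and the `output_dir` argument are hypothetical; it only assumes the files are named `<split>_results.json` as above.

```python
import json
from pathlib import Path


def load_split_metrics(output_dir, splits=("train", "eval")):
    """Read <split>_results.json for each split and prefix every key with the split name.

    `output_dir` is a hypothetical path to the folder holding the result files.
    """
    combined = {}
    for split in splits:
        path = Path(output_dir) / f"{split}_results.json"
        if not path.exists():
            continue  # a split with no result file contributes nothing
        with path.open() as f:
            metrics = json.load(f)
        for key, value in metrics.items():
            # Keys such as "train_runtime" already carry the prefix; leave them alone.
            name = key if key.startswith(f"{split}_") else f"{split}_{key}"
            combined[name] = value
    return combined


if __name__ == "__main__":
    for name, value in sorted(load_split_metrics("output").items()):
        print(f"{name}: {value}")
```

With the two files shown above, the merged dictionary would contain eval_exact_match and eval_f1 but no train_exact_match, train_f1, or loss entry for either split, which is exactly what I am missing.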
