Hey @lewtun,
Thanks for your reply. When running the above I get train_results.json and eval_results.json with the following content:
train_results.json
{
"epoch": 2.0,
"init_mem_cpu_alloc_delta": 2540351488,
"init_mem_cpu_peaked_delta": 0,
"init_mem_gpu_alloc_delta": 266590720,
"init_mem_gpu_peaked_delta": 0,
"train_mem_cpu_alloc_delta": 16502784,
"train_mem_cpu_peaked_delta": 331776,
"train_mem_gpu_alloc_delta": 822129664,
"train_mem_gpu_peaked_delta": 3155901440,
"train_runtime": 18.1442,
"train_samples": 168,
"train_samples_per_second": 1.543
}
eval_results.json
{
"epoch": 2.0,
"eval_samples": 42,
"exact_match": 39.02439024390244,
"f1": 62.318702403264744
}
I would like to have the losses and metrics on both the train and eval datasets. At the moment, it only returns metrics for the evaluation dataset.