How to interpret fine-tuned model results and use the model

Hello all,

I fine-tuned a seq2seq model on my custom dataset using the tutorial found here: transformers/examples/seq2seq at master · huggingface/transformers · GitHub

I am trying to compute the F1 and exact-match (EM) scores for the fine-tuned model, but am not sure how to interpret the output. I’ve pasted the training output below:

{
    "epoch": 3.0,
    "eval_gen_len": 55.7429,
    "eval_loss": 2.063843250274658,
    "eval_mem_cpu_alloc_delta": 1998448,
    "eval_mem_cpu_peaked_delta": 638828,
    "eval_rouge1": 33.8505,
    "eval_rouge2": 13.1365,
    "eval_rougeL": 27.8332,
    "eval_rougeLsum": 31.5921,
    "eval_runtime": 119.8097,
    "eval_samples": 35,
    "eval_samples_per_second": 0.292
}

Can you point me to documentation on how to interpret these results (the output reports ROUGE rather than F1/EM), and on how to load my fine-tuned model so I can evaluate it on a new piece of text?
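
For reference, here is a minimal sketch of what I imagine the loading and generation step looks like (the ./my-finetuned-model path, the input text, and the generation parameters are just placeholders on my part; I’m not sure this is the recommended approach):

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder: the output_dir I passed to the training script
model_dir = "./my-finetuned-model"

# Load the fine-tuned weights and the matching tokenizer from the checkpoint directory
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)

# Placeholder input text I want to run the model on
text = "Some new passage to evaluate the model on."

inputs = tokenizer(text, return_tensors="pt", truncation=True)
# num_beams and max_length here are guesses, not values from the tutorial
output_ids = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Is something along these lines correct, and if so, how would I go from the generated text to F1/EM numbers?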

Thanks for your help,
–Zak