How to get the score for a generated sentence from BartForConditionalGeneration

I am using the Trainer class to generate summaries for some text inputs. Here is the relevant piece of code:

    if training_args.do_predict:
        logger.info("*** Predict ***")

        predict_results = trainer.predict(
            predict_dataset, metric_key_prefix="predict", max_length=max_length, num_beams=num_beams
        )
        metrics = predict_results.metrics
        max_predict_samples = (
            data_args.max_predict_samples if data_args.max_predict_samples is not None else len(predict_dataset)
        )
        metrics["predict_samples"] = min(max_predict_samples, len(predict_dataset))

        trainer.log_metrics("predict", metrics)
        trainer.save_metrics("predict", metrics)

        if trainer.is_world_process_zero():
            if training_args.predict_with_generate:
                # Turn the generated token ids back into text.
                predictions = tokenizer.batch_decode(
                    predict_results.predictions, skip_special_tokens=True, clean_up_tokenization_spaces=True
                )
                predictions = [pred.strip() for pred in predictions]
                output_prediction_file = os.path.join(training_args.output_dir, "generated_predictions.txt")
                with open(output_prediction_file, "w") as writer:
                    writer.write("\n".join(predictions))

When I look at the metrics that are saved, this is what I see:

{'predict_loss': 9.717998504638672, 'predict_rouge1': 27.2727, 'predict_rouge2': 15.0, 'predict_rougeL': 27.2727, 'predict_rougeLsum': 27.2727, 'predict_gen_len': 18.0, 'predict_runtime': 0.7654, 'predict_samples_per_second': 2.613, 'predict_steps_per_second': 2.613, 'predict_samples': 2}

I don’t understand what these metrics measure, since I am only generating summaries from text inputs and I don’t have any ground truth associated with each sample. What is ROUGE calculated from? I was expecting one score per sample (2 in my case) to be output. How can I get that score from the prediction results? I have looked through the predict function in the Trainer class, and I don’t see any way of extracting this information.

I know that model.generate accepts output_scores=True, which I could use, but if I run the code above with do_predict, why can’t I get the score?
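
To make clear what I mean by a per-sample score: as far as I understand, calling model.generate with return_dict_in_generate=True and output_scores=True returns the per-step scores, and with beam search the output also has a sequences_scores field. My understanding is that such a score is roughly the sum of the generated tokens' log-probabilities divided by the length raised to length_penalty. Below is a minimal sketch of that calculation in plain Python. It is not the library's code: the names log_softmax and sequence_score and the toy numbers are made up for illustration, and the exact normalization used by generate may differ between versions.

```python
import math


def log_softmax(logits):
    """Convert one step's raw logits into log-probabilities."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def sequence_score(step_logits, token_ids, length_penalty=1.0):
    """Length-normalized sum of the chosen tokens' log-probabilities.

    step_logits: one list of vocabulary logits per generated token.
    token_ids:   the token id chosen at each step.
    """
    if not token_ids:
        raise ValueError("need at least one generated token")
    if len(step_logits) != len(token_ids):
        raise ValueError("need exactly one logits row per generated token")
    total = sum(log_softmax(row)[tok] for row, tok in zip(step_logits, token_ids))
    return total / (len(token_ids) ** length_penalty)


# Toy numbers (made up): a 3-token vocabulary and 2 generated tokens.
toy_logits = [[2.0, 0.5, 0.1], [0.2, 0.2, 3.0]]
toy_tokens = [0, 2]
score = sequence_score(toy_logits, toy_tokens)
print(f"score = {score:.4f}")
```

This is the kind of number I was hoping to get back for each of my 2 samples: a value closer to 0 would mean the model assigned higher probability to the summary it generated.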