Calculating the ROUGE metric when fine-tuning Pegasus

I’ve been fine-tuning Pegasus using the Hugging Face Trainer class.
I tried to compute the ROUGE metric with the method below, but the following error is raised every time:

import nltk
import numpy as np
from datasets import load_metric

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Replace -100 in the labels as we can't decode them.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # ROUGE expects a newline after each sentence.
    decoded_preds = ["\n".join(nltk.sent_tokenize(pred.strip())) for pred in decoded_preds]
    decoded_labels = ["\n".join(nltk.sent_tokenize(label.strip())) for label in decoded_labels]

    rouge = load_metric("rouge")
    result = rouge.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True)
    # Extract the mid f-measure from each ROUGE score.
    result = {key: value.mid.fmeasure * 100 for key, value in result.items()}

    # Add mean generated length.
    prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
    result["gen_len"] = np.mean(prediction_lens)

    return {k: round(v, 4) for k, v in result.items()}

TypeError: int() argument must be a string, a bytes-like object or a number, not 'list'
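For context, the metric function is passed to the Trainer roughly like this (a minimal sketch; the output path, model, and dataset variables are placeholders for our actual setup):

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="pegasus-finetuned",       # placeholder path
    evaluation_strategy="epoch",
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
)

trainer = Trainer(
    model=model,                          # PegasusForConditionalGeneration instance
    args=training_args,
    train_dataset=train_dataset,          # placeholder tokenized datasets
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,      # the function above
)

trainer.train()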

We are using the PegasusForConditionalGeneration class from transformers,
and according to this post, the predictions field of eval_pred is a tuple, presumably holding the model's raw logits rather than generated token IDs, which I can't pass to the tokenizer to compute the ROUGE scores.
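To illustrate, this is roughly what the decoding step would have to look like if the first element of the tuple holds the logits (that is an assumption on my part, not something from our script):

import numpy as np

predictions, labels = eval_pred
if isinstance(predictions, tuple):
    # assumption: the first element of the tuple holds the logits
    predictions = predictions[0]
# collapse the vocabulary dimension to recover token IDs
token_ids = np.argmax(predictions, axis=-1)
decoded_preds = tokenizer.batch_decode(token_ids, skip_special_tokens=True)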

We are using this fine-tuning script as a base.

Is there any way around this?