ROUGE score problem

The ROUGE metric does not work for a non-English (Arabic) language:

!pip install rouge_score
from datasets import load_metric
metric = load_metric("rouge")

pred_str = ['السلام عليكم كيف حالك']
label_str = ['السلام عليكم صديقي كيف حالك']
metric.add_batch(predictions=pred_str, references=label_str)
metric.compute()

Output:

{'rouge1': AggregateScore(low=Score(precision=0.0, recall=0.0, fmeasure=0.0), mid=Score(precision=0.0, recall=0.0, fmeasure=0.0), high=Score(precision=0.0, recall=0.0, fmeasure=0.0)),
'rouge2': AggregateScore(low=Score(precision=0.0, recall=0.0, fmeasure=0.0), mid=Score(precision=0.0, recall=0.0, fmeasure=0.0), high=Score(precision=0.0, recall=0.0, fmeasure=0.0)),
'rougeL': AggregateScore(low=Score(precision=0.0, recall=0.0, fmeasure=0.0), mid=Score(precision=0.0, recall=0.0, fmeasure=0.0), high=Score(precision=0.0, recall=0.0, fmeasure=0.0)),
'rougeLsum': AggregateScore(low=Score(precision=0.0, recall=0.0, fmeasure=0.0), mid=Score(precision=0.0, recall=0.0, fmeasure=0.0), high=Score(precision=0.0, recall=0.0, fmeasure=0.0))}

This problem occurs because the rouge_score tokenizer strips out all non-English characters. We can make it accept Arabic, Kurdish, and Farsi by replacing 'a-z0-9' with 'a-z0-9\u0600-\u06ff\u0750-\u077f\ufb50-\ufbc1\ufbd3-\ufd3f\ufd50-\ufd8f\ufd50-\ufd8f\ufe70-\ufefc\uFDF0-\uFDFD.0-9' in the tokenizer's regular expressions.
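To see the root cause in isolation, here is a minimal sketch (a hypothetical re-implementation, not the library's exact code) of a tokenizer that keeps only a-z0-9. With such a pattern, both the prediction and the reference tokenize to empty lists, so every overlap count is zero:

import re

def ascii_only_tokenize(text):
    #hypothetical approximation of the default ROUGE tokenization:
    #lower-case the text, then keep only ASCII letters and digits
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return [tok for tok in text.split() if tok]

print(ascii_only_tokenize("السلام عليكم كيف حالك"))  #[] -- every Arabic character is dropped
print(ascii_only_tokenize("hello my friend"))        #['hello', 'my', 'friend']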
To apply the fix, you can simply run this code once before calling the ROUGE metric.

#path to the rouge_score tokenizer file (see below for how to find it on your system)
tokenize_path = "/opt/conda/lib/python3.7/site-packages/rouge_score/tokenize.py"

#read the tokenizer source into a string
with open(tokenize_path, "rt") as fin:
    data = fin.read()

#replace all occurrences of the character class so Arabic-script characters are kept
data = data.replace('a-z0-9', 'a-z0-9\\u0600-\\u06ff\\u0750-\\u077f\\ufb50-\\ufbc1\\ufbd3-\\ufd3f\\ufd50-\\ufd8f\\ufd50-\\ufd8f\\ufe70-\\ufefc\\uFDF0-\\uFDFD.0-9')

#overwrite the file with the patched source
with open(tokenize_path, "wt") as fout:
    fout.write(data)

To find the rouge_score tokenizer path on your system, you can run:

from rouge_score import tokenize
print(tokenize.__file__)
#output: the path of the tokenize.py file

Hi, I tried your approach, but it still doesn't work with my dataset in Persian. I get all zeros for my ROUGE scores. I now think it is a bug in the Hugging Face ROUGE metric, because I got the correct ROUGE scores by using the rouge library in Python directly. I hope Hugging Face fixes it soon :slight_smile:
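For anyone who wants to try the same workaround, here is a minimal sketch of scoring directly with the standalone rouge package mentioned above; the get_scores call is its commonly documented usage, but check the package's documentation for your version, and the example strings are just the ones from the original post:

!pip install rouge
from rouge import Rouge

pred_str = ['السلام عليكم كيف حالك']
label_str = ['السلام عليكم صديقي كيف حالك']

rouge = Rouge()
#get_scores takes parallel lists of hypothesis and reference strings
scores = rouge.get_scores(pred_str, label_str)
print(scores)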