It usually works well, but it's a bit of a blunt instrument: it penalizes every repeated token, including tokens in the middle or end of a word, stopwords, and punctuation. If the rep penalty is high, this can result in funky outputs.
For example, take the sentence: “The United States (U.S.) is the third-largest country in the world and is the largest country in the Americas.” You get tokens like this (I used the T5Tokenizer, but any tokenizer will produce similar output):
['▁The', '▁United', '▁States', '▁(', 'U', '.', 'S', '.', ')', '▁is', '▁the', '▁third', '-', 'large', 's', 't', '▁country', '▁in', '▁the', '▁world', '▁and', '▁is', '▁the', '▁largest', '▁country', '▁in', '▁the', '▁America', 's', '.']
In this example, the repetition penalty will penalize the “s” in “the Americas” (because it already saw an “s” token). If the repetition penalty is high, the model could end up writing something weird like “… the largest country in the America”.
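To make the mechanics concrete, here's a minimal sketch of the common CTRL-style repetition penalty (the rule used by libraries like Hugging Face Transformers): every token ID that has already appeared gets its logit divided by the penalty if positive, or multiplied by it if negative. The function name and the toy five-token vocabulary are made up for illustration; index 4 stands in for the “s” subword.

```python
import numpy as np

def apply_repetition_penalty(logits, seen_token_ids, penalty):
    # For every token already generated, push its logit down:
    # divide positive logits by the penalty, multiply negative ones by it.
    out = logits.copy()
    for tid in set(seen_token_ids):
        if out[tid] > 0:
            out[tid] /= penalty
        else:
            out[tid] *= penalty
    return out

# Toy 5-token vocabulary; index 4 plays the role of the "s" subword.
logits = np.array([1.0, 0.5, -0.2, 2.0, 3.0])
seen = [4, 1]  # "s" (index 4) already appeared earlier in the sequence

penalized = apply_repetition_penalty(logits, seen, penalty=1.5)
# The "s" logit drops from 3.0 to 2.0, nudging the model toward
# ending the sentence with "... the America" instead of "... the Americas".
```

Note that the penalty has no idea the “s” it saw earlier was part of “third-largest”; it only compares token IDs, which is exactly why mid-word subwords get caught in the net.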
In my experience, a rep penalty of 1.5 is high enough that you may well see things like this happen. Proofread a bunch of your model’s outputs, and lower the rep penalty if you spot artifacts like this.