It usually works well, but it's a bit of a blunt instrument: it penalizes every repeated token, including tokens in the middle or end of a word, stopwords, and punctuation. If the rep penalty is high, this can result in funky outputs.
For example, take the sentence: “The United States (U.S.) is the third-largest country in the world and is the largest country in the Americas.” You get tokens like this (I used the T5Tokenizer, but any tokenizer will produce similar output):
['▁The', '▁United', '▁States', '▁(', 'U', '.', 'S', '.', ')', '▁is', '▁the', '▁third', '-', 'large', 's', 't', '▁country', '▁in', '▁the', '▁world', '▁and', '▁is', '▁the', '▁largest', '▁country', '▁in', '▁the', '▁America', 's', '.']
In this example, the repetition penalty will penalize the “s” in “the Americas” (because it already saw an “s” token). If the repetition penalty is high, the model could end up writing something weird like “… the largest country in the America”.
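To make the mechanics concrete, here's a minimal sketch of the common CTRL-style repetition penalty (the rule used by libraries like Hugging Face Transformers): every token ID that has already appeared gets its logit divided by the penalty if positive, or multiplied by it if negative. The function name and the toy five-token vocabulary are made up for illustration; index 4 stands in for the “s” subword.

```python
import numpy as np

def apply_repetition_penalty(logits, seen_token_ids, penalty):
    # For every token already generated, push its logit down:
    # divide positive logits by the penalty, multiply negative ones by it.
    out = logits.copy()
    for tid in set(seen_token_ids):
        if out[tid] > 0:
            out[tid] /= penalty
        else:
            out[tid] *= penalty
    return out

# Toy 5-token vocabulary; index 4 plays the role of the "s" subword.
logits = np.array([1.0, 0.5, -0.2, 2.0, 3.0])
seen = [4, 1]  # "s" (index 4) already appeared earlier in the sequence

penalized = apply_repetition_penalty(logits, seen, penalty=1.5)
# The "s" logit drops from 3.0 to 2.0, nudging the model toward
# ending the sentence with "... the America" instead of "... the Americas".
```

Note that the penalty has no idea the “s” it saw earlier was part of “third-largest”; it only compares token IDs, which is exactly why mid-word subwords get caught in the net.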
In my experience, a rep penalty of 1.5 is high enough that you may well see things like this happen. Proofread a bunch of your model’s outputs, and lower the rep penalty if you spot artifacts like this.