T5 generates repetitive sentences

tsei902 · May 1, 2023, 1:55pm

My T5 model generates repetitive sentences. I fine-tuned it with teacher forcing on aligned pairs of complex and simple sentences (each side of a pair is a single sentence).

At inference time, I pass the model one sentence per generation call.

An example of output sentences I get is as follows:
De andere kant van de gewapende conflicten bestaat voornamelijk uit het Soedanese leger en de Janjaweed, een Soedanese militiegroep die voornamelijk is gerekruteerd uit de Afro-Arabische Abbal
Jeddah is de belangrijkste toegangspoort tot Mekka, de heiligste stad van de islam. Het is de belangrijkste toegangspoort tot Mekka. Het is de belangrijkste toegangspoort tot Mekka. Het is de belangrijkste
De Grote Donkere Vlek is een gat in het methaanwolkendek van Neptunus. Het is een gat in het methaanwolkendek van Neptunus. Het is een gat in het methaanwolkendek van
Zijn volgende werk, zaterdag, volgt op een bijzonder veelbewogen dag in zijn leven. Een bijzonder veelbewogen dag. Een neurochirurg is een neurochirurg. Het is een neurochirurg. Het is een neurochirurg. Het is een neurochirurg.

So the model is highly repetitive, but not within a single sentence: it produces several sentences that all repeat the first one.
I find this intriguing because the model only ever sees single sentences during training. Training setup: Adafactor optimizer, learning rate = 0.0001, eval batch size = 6 (8 for 10000), epochs = 4, warmup steps = 5, eval loss ~1.41.

During training, the inputs are encoded with truncation but without padding.
Generation uses max_length = 128 and min_length = 5.
Could the missing padding be the reason the model keeps generating tokens up to max_length? My goal is to produce only one output sentence.

Many thanks in advance!
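
A minimal sketch of one way to end up with a single output sentence, under stated assumptions. Generation normally stops when the model emits its end-of-sequence token or reaches max_length, so the sketch constrains decoding and then trims the decoded text. Only max_length and min_length come from the post; num_beams, no_repeat_ngram_size and early_stopping are assumed additions (all are standard `generate` arguments in transformers), and `first_sentence` and `simplify` are hypothetical helper names.

```python
import re

# max_length and min_length are the values from the post; the other
# entries are assumptions added for illustration.
GENERATION_KWARGS = {
    "max_length": 128,
    "min_length": 5,
    "num_beams": 4,             # assumption: beam search
    "no_repeat_ngram_size": 3,  # assumption: forbid repeating any 3-gram of tokens
    "early_stopping": True,     # assumption: stop once enough beams have finished
}


def first_sentence(text):
    """Keep only the text up to the first sentence-ending punctuation mark.

    Naive on purpose: it splits after '.', '!' or '?' followed by
    whitespace, so an abbreviation such as 'e.g. ' would cut too early.
    """
    return re.split(r"(?<=[.!?])\s+", text.strip(), maxsplit=1)[0]


def simplify(model, tokenizer, sentence):
    """Hypothetical helper: one input sentence in, one output sentence out."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return first_sentence(decoded)
```

Trimming after decoding only hides the symptom; blocking repeated n-grams changes what the model is allowed to generate, which may also remove legitimate repeated phrases.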

iarbel · July 11, 2023, 9:51am

I'm experiencing the exact same issue with conditional generation. I fine-tuned T5-3B with LoRA (using PEFT) for a task of translating product technical details into descriptions. The training descriptions were fed to the model as sentences, delimited by '\n#'. After training, the generated text consists of chunks of sentences that each make sense on their own but are highly repetitive relative to one another. Using 'repetition_penalty' improves this only slightly, and at the cost of longer, less coherent outputs.

Did you find any tricks for the training or inference that improved your model’s performance?
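
For context on the 'repetition_penalty' observation above, here is a minimal sketch of the rule that, to my knowledge, this argument applies (it originates from the CTRL paper): the score of every token that has already been generated is divided by the penalty if it is positive and multiplied by it if it is negative. The function name and the plain-list representation are assumptions for illustration; the library operates on tensors.

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """Return a copy of `logits` with already-generated tokens made less likely."""
    penalized = list(logits)
    for token_id in set(generated_ids):
        score = penalized[token_id]
        # Positive scores shrink and negative scores grow more negative,
        # so the token becomes less likely in both cases.
        penalized[token_id] = score / penalty if score > 0 else score * penalty
    return penalized


logits = [2.0, -1.0, 0.5]
print(apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=2.0))
# prints [1.0, -2.0, 0.5]
```

Because the rule works on individual tokens rather than on sentences, it also discourages ordinary words that legitimately need to recur, which may be one reason the outputs become less coherent.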

abdelmomenn · May 1, 2024, 8:47pm

Hello, did you find a solution for this?

tsei902 · May 2, 2024, 7:17am

In my case, more data definitely helped but did not solve the issue completely. There has been a lot of recent literature on repetition in generated text, so I suggest you have a look there as well!
