Hi! I’m currently exploring some of the transformers library’s capabilities and had a question about the model.generate() method.
I’m using an implementation like this:
output_sequences = model.generate(
    input_ids=input_ids,
    top_k=40,
    top_p=0.9,
    max_new_tokens=1,
    do_sample=True,
    num_return_sequences=25,
    return_dict_in_generate=True,
    output_scores=True,
)
predictions = [
    dict(
        w=tokenizer.decode(output_sequences.sequences[i][-1]),
        p=...score calculation...
    )
    for i in range(output_sequences.sequences.shape[0])
]
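For completeness, here’s a minimal sketch of the score calculation I’m using, assuming it’s reasonable to softmax the processed logits that output_scores=True puts in output_sequences.scores (a tuple with one tensor per generated step, so with max_new_tokens=1 there’s exactly one, of shape (num_return_sequences, vocab_size)):

import torch

# Processed logits for the single generated step; these already
# reflect the top_k/top_p filtering applied during sampling.
step_logits = output_sequences.scores[0]
probs = torch.softmax(step_logits, dim=-1)

predictions = [
    dict(
        w=tokenizer.decode(output_sequences.sequences[i][-1]),
        # probability the sampler assigned to the token it actually drew
        p=probs[i, output_sequences.sequences[i][-1]].item(),
    )
    for i in range(output_sequences.sequences.shape[0])
]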
So that, given a prompt, I get a response like this:
input_prompt = "Hello there, how are"
#...Tokenize inputs...
#...generate...
predictions = [
{ w: " you", p: some_score1 }
{ w: " things", p: some_score2 }
{ w: " you", p: some_score3 }
#...etc
]
The only issue is that I get repeated tokens back, as shown above. Is there any way to ensure the generation returns unique tokens? I didn’t have this issue when I implemented this with beam search, but that approach is too slow for my application. For reference, I’ve sketched an alternative I’ve been considering below.
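The sketch skips generate() entirely and takes the top-k of the next-token distribution from a single forward pass, which yields unique tokens by construction. (The k=25 is just my placeholder, mirroring num_return_sequences=25 above.)

import torch

# One forward pass over the prompt; no sampling involved.
with torch.no_grad():
    logits = model(input_ids).logits  # (batch, seq_len, vocab_size)

# Distribution over the next token, taken from the last prompt position.
next_token_probs = torch.softmax(logits[0, -1, :], dim=-1)

# torch.topk returns distinct indices, so the candidate tokens are unique.
top_probs, top_ids = torch.topk(next_token_probs, k=25)

predictions = [
    {"w": tokenizer.decode(token_id), "p": prob.item()}
    for prob, token_id in zip(top_probs, top_ids)
]

I’d still prefer a generate()-native way to do this, if one exists.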