Exclude words from GPT-2 generate( )

deathcrush · March 19, 2021, 7:32pm

Hello!

It seems that this functionality is supported through the `bad_words_ids` input to the generate API. The docs briefly describe that you need to get the token ids for the words you care about using the tokenizer and then simply pass those to generate:

**bad_words_ids** ( `List[List[int]]` , optional) – List of token ids that are not allowed to be generated. In order to get the tokens of the words that should not appear in the generated text, use `tokenizer(bad_word, add_prefix_space=True).input_ids` .
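Here is a minimal sketch of how that could look end to end. The helper names `get_bad_words_ids` and `generate_without` are made up for illustration, and the `"gpt2"` checkpoint and `max_new_tokens` value are just assumptions. I am also assuming a recent `transformers` release, where `add_prefix_space=True` is set when loading the tokenizer instead of being passed at call time as in the quoted docs.

```python
from typing import List


def get_bad_words_ids(tokenizer, bad_words: List[str]) -> List[List[int]]:
    """Hypothetical helper: one inner list of token ids per banned word.

    A single word can map to several tokens, hence List[List[int]].
    """
    return tokenizer(bad_words, add_special_tokens=False).input_ids


def generate_without(prompt: str, bad_words: List[str],
                     model_name: str = "gpt2", max_new_tokens: int = 20) -> str:
    """Hypothetical helper: generate text while banning `bad_words`."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # Assumption: the prefix space is configured at load time. GPT-2's BPE
    # encodes " word" (mid-sentence) differently from "word" (start of text),
    # and the mid-sentence form is usually the one you want to ban.
    tokenizer_with_prefix_space = AutoTokenizer.from_pretrained(
        model_name, add_prefix_space=True
    )

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        bad_words_ids=get_bad_words_ids(tokenizer_with_prefix_space, bad_words),
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

You would then call something like `generate_without("The weather today is", ["sunny", "rainy"])`. Keep in mind that this bans specific token sequences, so other spellings or capitalizations of a word (which tokenize differently) may still show up unless you add them to the list too.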

I hope this helps!
