Text generation using custom constraints

Hi everyone,

I’d like to train a Seq2Seq model to predict a new sequence that contains the tokens of the input sequence plus some new tokens (here, token6, token7, token8).

Here is an abstract example of my task, since I’m not training on natural language but on a representation similar to SMILES strings for molecules:

Input: ['token1', 'token2', 'token3', 'token1', 'token5']
Output: ['token1', 'token2', 'token6', 'token3', 'token1', 'token7', 'token8', 'token5']

I wanted to ask if it is possible to constrain the text generation so that the model is forced to use the tokens of the input sequence (here, 2x token1, 1x token2, 1x token3, 1x token5).
In the Generation documentation I found an argument called constraints, but I couldn’t figure out how to use it for this.
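
For reference, here is a minimal sketch of what I’ve been imagining, based on the constrained beam search examples in the docs (the checkpoint name is a placeholder for my own trained Seq2Seq model):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, PhrasalConstraint

# Placeholder checkpoint; in practice this would be my own trained model
tokenizer = AutoTokenizer.from_pretrained("my-seq2seq-model")
model = AutoModelForSeq2SeqLM.from_pretrained("my-seq2seq-model")

input_tokens = ["token1", "token2", "token3", "token1", "token5"]
input_ids = tokenizer(" ".join(input_tokens), return_tensors="pt").input_ids

# One PhrasalConstraint per distinct input token, so each of them
# has to appear somewhere in the generated sequence
constraints = [
    PhrasalConstraint(tokenizer(tok, add_special_tokens=False).input_ids)
    for tok in set(input_tokens)
]

# Constrained generation only works with beam search (num_beams > 1)
outputs = model.generate(
    input_ids,
    constraints=constraints,
    num_beams=5,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

What I don’t see is how to enforce the counts with this approach, e.g. that token1 has to appear exactly twice, since as far as I understand each constraint is only fulfilled once.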

Thanks in advance!