How to set stopping criteria in model.generate() when a certain word appears


I need generation to stop as soon as the word [/SENTENCE] appears.
But the model doesn’t generate the word as a single token; instead, it generates it as a sequence of subwords, like this:
[ [/,SEN,TE,NC,E] ]

The corresponding IDs from the tokenizer are as follows
(ID => subword):
28792 => [
28748 => /
28759 => SEN
2654 => TE
1197 => NC
28793 => E]

So how can I set up a condition in StoppingCriteriaList so that generation stops when [/SENTENCE] is found?

IMHO, if you are fine-tuning the model to stop generation when it encounters the [/SENTENCE] token and it is generating subwords instead, you should probably train it for a few more epochs.

You could include the predictions generated during training in the evaluation loop and see at which point the complete EOS token is being generated.

@Sandy1857 So why not solve this problem by using the transformers.StoppingCriteriaList() class inside the model.generate() function?

@Pradeep1995 this may be a good solution.
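
For reference, here is a minimal sketch of what such a stopping criterion might look like. It is not a definitive implementation: the class name StopOnTokenSequence is made up for illustration, the IDs are the ones quoted in the question (please check them against your own tokenizer), and the check only looks at the first sequence in the batch. The try/except fallback is there only so the suffix logic can be tried without torch and transformers installed.

```python
try:
    import torch  # noqa: F401  (transformers' stopping criteria depend on it)
    from transformers import StoppingCriteria, StoppingCriteriaList
except ImportError:
    # Fallback so the suffix check below can be tried without the libraries.
    StoppingCriteria = object
    StoppingCriteriaList = list


class StopOnTokenSequence(StoppingCriteria):
    """Hypothetical criterion: stop once the ids end with `stop_ids`."""

    def __init__(self, stop_ids):
        self.stop_ids = [int(t) for t in stop_ids]

    def __call__(self, input_ids, scores=None, **kwargs):
        # input_ids has shape (batch, seq_len); this sketch checks row 0 only.
        n = len(self.stop_ids)
        tail = [int(t) for t in input_ids[0][-n:]]
        return tail == self.stop_ids


# IDs quoted in the question for "[/SENTENCE]"; verify with your tokenizer.
STOP_IDS = [28792, 28748, 28759, 2654, 1197, 28793]
stopping_criteria = StoppingCriteriaList([StopOnTokenSequence(STOP_IDS)])

# Hypothetical usage, assuming `model` and `inputs` already exist:
# output = model.generate(**inputs, max_new_tokens=200,
#                         stopping_criteria=stopping_criteria)
```

Two caveats. The same string can be split into different subwords depending on the text before it, so an exact ID match may miss some occurrences; decoding the last few tokens with the tokenizer and checking for the string is a more forgiving variant. Also, the stop subwords remain in the output and have to be stripped afterwards. Recent versions of transformers also appear to accept stop_strings=["[/SENTENCE]"] together with tokenizer=tokenizer in generate(), which may be simpler if your installed version supports it.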
