How to set stopping criteria in model.generate() when a certain word appears
The word that should stop the generation when it appears is: [/SENTENCE]
But the model doesn’t generate the word as a single token. Instead, it generates it as a sequence of subwords, like this:
[ "[", "/", "SEN", "TE", "NC", "E]" ]
The corresponding IDs from the tokenizer are (ID => subword):
28792 => [
28748 => /
28759 => SEN
2654 => TE
1197 => NC
28793 => E]
So how can I set up a condition in StoppingCriteriaList that stops the generation as soon as [/SENTENCE] has been generated?
IMHO, if you are fine-tuning the model to stop generating when it reaches the [/sentence] token and it is producing subwords instead, you should probably train it for a few more epochs.
You could also include the predictions generated during training in the evaluation loop and check at which point the complete EOS token starts being generated.
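On the StoppingCriteriaList part of the question: one possible approach is a custom StoppingCriteria that compares the last few generated token IDs with the subword ID sequence listed in the question. This is only a minimal sketch. The class name StopOnTokenSequence is made up for illustration, the IDs are copied from the question as posted, and the check assumes a batch size of 1 (it only looks at the first row of input_ids). The same string can also tokenize differently depending on the preceding text, so it is worth verifying the ID sequence against real outputs.

```python
import torch
from transformers import StoppingCriteria, StoppingCriteriaList


class StopOnTokenSequence(StoppingCriteria):
    """Hypothetical helper: stop once the newest tokens equal `stop_ids`."""

    def __init__(self, stop_ids):
        self.stop_ids = torch.tensor(stop_ids, dtype=torch.long)

    def __call__(self, input_ids, scores, **kwargs):
        n = self.stop_ids.shape[0]
        # Not enough tokens yet to contain the full stop sequence.
        if input_ids.shape[1] < n:
            return False
        # Assumption: batch size 1, so only the first row is inspected.
        tail = input_ids[0, -n:]
        return bool(torch.equal(tail, self.stop_ids.to(tail.device)))


# Subword IDs for [/SENTENCE], taken from the question.
stop_ids = [28792, 28748, 28759, 2654, 1197, 28793]
stopping_criteria = StoppingCriteriaList([StopOnTokenSequence(stop_ids)])

# Usage, assuming `model` and tokenized `inputs` already exist:
# outputs = model.generate(**inputs, stopping_criteria=stopping_criteria)
```

Note that the stop sequence itself stays at the end of the generated output, so you may want to strip it after decoding.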