Ensure the sentence is complete during generation

During generation, I’m using max_length to cap the output when longer sequences are not required. However, I do not want generation to stop while the sentence is still incomplete. Is there a reliable way to stop once a full sentence has been generated?

AFAIK, generation stops once the model emits an end-of-sequence (EOS) token, provided max_length doesn’t cut it off first. You can use StoppingCriteria (which you implicitly do by setting max_length) to construct arbitrary conditions for when generation should stop.

You’re right about the EOS token. If I don’t specify the max_length parameter, the model can generate a long text that stops making sense halfway through or deviates from the provided context. I want the generation to be a bit more natural. Can you please share an example of how StoppingCriteria would work? I didn’t find a usage example in the docs.

You can find some implementations here. And you can search for “stopping_criteria” in generation_utils.py to understand the usage.
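Since a usage example was asked for, here is a minimal sketch of a custom criterion, not the library's own implementation. The class name SentenceEndCriteria and its prompt_length and terminators parameters are hypothetical; it assumes a recent transformers version where a criterion returns one boolean per sequence, and a decoder-only model whose output starts with the prompt tokens.

```python
import torch
from transformers import StoppingCriteria, StoppingCriteriaList


class SentenceEndCriteria(StoppingCriteria):
    """Flag each sequence whose newly generated text ends with a sentence terminator."""

    def __init__(self, tokenizer, prompt_length, terminators=(".", "!", "?")):
        self.tokenizer = tokenizer
        self.prompt_length = prompt_length  # number of prompt tokens to skip
        self.terminators = tuple(terminators)

    def __call__(self, input_ids, scores, **kwargs):
        done = []
        for row in input_ids:
            # Decode only the tokens produced after the prompt.
            text = self.tokenizer.decode(
                row[self.prompt_length:], skip_special_tokens=True
            )
            done.append(text.rstrip().endswith(self.terminators))
        return torch.tensor(done, dtype=torch.bool, device=input_ids.device)


# Usage sketch (model, tokenizer and prompt are whatever you already have):
# inputs = tokenizer(prompt, return_tensors="pt")
# criteria = StoppingCriteriaList(
#     [SentenceEndCriteria(tokenizer, inputs["input_ids"].shape[1])]
# )
# output = model.generate(**inputs, max_new_tokens=50, stopping_criteria=criteria)
```

Keeping a token limit alongside the criterion still bounds the output if no terminator ever appears. Note that this simple check also fires on abbreviations such as "e.g.", so it may stop earlier than a human would.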

I was struggling with this issue today. The easiest way (instead of providing a custom stopping criteria) is to set both min_length and max_length parameters to the same value. This ensures generations of exactly a given length, no shorter or longer.
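A rough sketch of that approach follows. The tiny randomly initialised GPT-2 and the hard-coded prompt ids are assumptions made only so the snippet runs without downloading weights; with a real model you would pass your tokenizer's output instead. For decoder-only models, min_length and max_length count the prompt tokens as well.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny randomly initialised model: an assumption so the sketch runs
# without downloading weights; the generated tokens are meaningless.
config = GPT2Config(
    vocab_size=100, n_positions=32, n_embd=16, n_layer=1, n_head=2,
    bos_token_id=0, eos_token_id=0, pad_token_id=0,
)
model = GPT2LMHeadModel(config).eval()

prompt_ids = torch.tensor([[5, 6, 7]])  # stand-in for tokenizer output
output = model.generate(
    prompt_ids,
    attention_mask=torch.ones_like(prompt_ids),
    min_length=12,   # EOS is suppressed until the sequence has 12 tokens
    max_length=12,   # generation is cut off at 12 tokens
    do_sample=False,
)
print(output.shape)
```

This fixes the length but says nothing about where the text ends, so the final sentence can still be cut off mid-way.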