Stopping generation before max_new_tokens

I’m experimenting with a variety of LLaMA models, especially some 4-bit Wizard and Guanaco variants. All of them frequently generate text that ends abruptly, as though they hit max_new_tokens and just stopped.

I tried exponential_decay_length_penalty, but with limited success. Maybe my settings are bad?

Strangely, I can’t find any discussion of how to configure generation to “find a good stopping point” prior to hitting the brick wall of max_new_tokens. How is this supposed to work?
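For context on what I’ve understood so far (please correct me if this is wrong): generation normally ends early when the model emits the tokenizer’s eos_token_id, and when that doesn’t happen, people seem to add custom stopping criteria that watch for a stop sequence in the generated ids. Here is a minimal pure-Python sketch of that per-step check; the token ids and stop strings are made up for illustration, not taken from any real tokenizer:

```python
# Sketch of the check a custom stopping criterion performs after each
# decode step: stop as soon as the tail of the generated ids matches a
# stop sequence (e.g. the EOS token, or an instruction-format marker
# like "\n### Human:" for Wizard/Guanaco-style prompts).

def ends_with_stop_sequence(generated_ids, stop_sequences):
    """Return True if generated_ids ends with any stop sequence."""
    for stop in stop_sequences:
        if len(generated_ids) >= len(stop) and generated_ids[-len(stop):] == stop:
            return True
    return False

# Hypothetical ids: pretend 2 is EOS and [13, 835, 12968] encodes a
# stop string like "\n### Human".
stop_sequences = [[2], [13, 835, 12968]]

print(ends_with_stop_sequence([5, 9, 2], stop_sequences))            # True: hit EOS
print(ends_with_stop_sequence([5, 13, 835, 12968], stop_sequences))  # True: hit stop string
print(ends_with_stop_sequence([5, 9, 7], stop_sequences))            # False: keep going
```

Is something like this the intended mechanism, or is there a built-in generation setting I’m missing?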
