I'm using a text-generation pipeline to generate a continuation of a prompt. My chatbot requires a prompt of, say, 64 tokens and should generate at most 32 new tokens. If I set max_length=32, it warns that the context already has 64 tokens while max_length is only 32. If I instead set max_length=32+64=96, it generates 96 tokens on top of my context, so 64+96=160 tokens in total. How can I generate only 32 tokens on top of my prompt?
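For reference, a minimal version of what I'm running (the model name and prompt are placeholders; the commented results are what I observe on my setup):

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # placeholder model
prompt = "..."  # a prompt of roughly 64 tokens

generator(prompt, max_length=32)       # warns: context (64 tokens) exceeds max_length (32)
generator(prompt, max_length=32 + 64)  # in my case: 96 new tokens, 160 in total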
My current workaround is to add the prompt length to max_length. Since len(prompt) would count characters, I count tokens with the pipeline's tokenizer instead:

prompt_tokens = len(generator.tokenizer(prompt)["input_ids"])  # tokens, not characters
max_length = 32 + prompt_tokens
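Alternatively, here is a sketch of what I think should work, assuming the pipeline forwards generation kwargs to model.generate: max_new_tokens budgets only the newly generated tokens, so the prompt length no longer has to be added by hand.

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # placeholder model
prompt = "..."  # a prompt of roughly 64 tokens

# max_new_tokens counts only the continuation, not the prompt
out = generator(prompt, max_new_tokens=32)

Is max_new_tokens the intended way to do this, or should I keep computing max_length by hand?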