Generate() returns full prompt plus answer

Hi all,

Some models, when generating text, return the prompt plus the answer in the output. For instance, Mistral models and Phi-2 behave this way. Other models, such as Flan-T5, don't: their output contains only the generated text, without the prompt prepended. Is there a way to return only the generated text for models like Mistral or Phi-2? I've tried some solutions found online (e.g., the thread "tiiuae/falcon-40b-instruct · Model returns entire input prompt together with output" suggests setting `include_prompt_in_result` or `return_full_text` to False), but these arguments don't seem to exist on `generate()`.
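
For reference, here's a minimal snippet showing what I mean (using Phi-2 as an example; any decoder-only causal LM shows the same behavior):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")

inputs = tokenizer("What is the capital of France?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)

# Prints the prompt followed by the continuation, not the answer alone
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```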

Thanks in advance.

Hi @alexszen

Decoder-only models like Mistral and Phi-2 generate a continuation of the input ids, so the returned sequence includes the prompt (encoder-decoder models like Flan-T5 emit only new tokens). You can exclude the prompt part, whose length in tokens is `prompt_length`, from the model's output and decode only the answer, as follows (reusing your `tokenizer`, `model`, and prompt from above):

inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(input_ids=inputs['input_ids'])

prompt_length = inputs['input_ids'].shape[1]

answer = tokenizer.decode(outputs[0][prompt_length:])
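
As a side note, the `return_full_text` flag you found does exist, but on the text-generation pipeline rather than on `generate()`. A minimal sketch, again assuming microsoft/phi-2 as the model:

```python
from transformers import pipeline

# return_full_text=False makes the pipeline strip the prompt from the output
generator = pipeline("text-generation", model="microsoft/phi-2")
result = generator("What is the capital of France?",
                   max_new_tokens=20, return_full_text=False)
print(result[0]["generated_text"])
```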