A standard way to have the `generate` method of `GenerationMixin` output only the generated tokens

nasheed · November 19, 2023, 6:51am

I assume a very common use case like Question Answering would ideally only need to output the generated tokens (essentially discarding the prompt tokens). Is there a standard way to achieve this?

I understand we can use the `return_full_text=False` parameter in the pipeline object's `__call__` method to achieve this. Is there a way to do it directly on the `generate` method of `GenerationMixin`?
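For illustration, one commonly used workaround is to slice the prompt off the returned ids yourself. This is a minimal sketch, assuming a decoder-only model whose `generate` output is each prompt followed by its new tokens; `strip_prompt` is a hypothetical helper name, not part of the library, and the token ids below are made up.

```python
def strip_prompt(output_ids, prompt_length):
    """Drop the first `prompt_length` token ids from every generated sequence.

    Hypothetical helper: works on a list of lists or on the rows of a 2D tensor.
    """
    return [seq[prompt_length:] for seq in output_ids]


# Toy stand-in for what generate() returns from a decoder-only model:
# each row is the prompt ids followed by the newly generated ids.
prompt_ids = [[101, 7592, 2088]]
output_ids = [[101, 7592, 2088, 2023, 2003, 102]]

new_tokens = strip_prompt(output_ids, prompt_length=len(prompt_ids[0]))
print(new_tokens)

# With real tensors (not run here), the equivalent would be roughly:
#   inputs = tokenizer(prompt, return_tensors="pt")
#   outputs = model.generate(**inputs, max_new_tokens=20)
#   new_tokens = outputs[:, inputs["input_ids"].shape[1]:]
#   text = tokenizer.batch_decode(new_tokens, skip_special_tokens=True)
```

For batched prompts this only lines up if every row has the same prompt length, which is the case when the tokenizer pads on the left. Encoder-decoder models should not need it, since their `generate` output does not include the input prompt.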
