I assume that a very common use case such as question answering only needs the generated tokens as output, essentially discarding the prompt tokens. Is there a standard way to achieve this?
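To make "discarding the prompt tokens" concrete, here is a minimal sketch. It assumes a decoder-only model whose returned sequences begin with the prompt ids, so that dropping the prompt amounts to slicing each sequence at the prompt length. Plain Python lists stand in for token-id tensors, the ids are made up, and strip_prompt_tokens is a hypothetical helper, not part of any library.

```python
from typing import List


def strip_prompt_tokens(sequences: List[List[int]], prompt_length: int) -> List[List[int]]:
    """Drop the first `prompt_length` ids from every sequence.

    Assumes each sequence is laid out as prompt ids followed by generated ids,
    and that all prompts in the batch share the same (padded) length.
    """
    return [seq[prompt_length:] for seq in sequences]


# Toy ids only: no real tokenizer or model is involved here.
prompt_ids = [101, 2054, 2003]
full_output = [prompt_ids + [1996, 3437, 102]]  # prompt followed by continuation

generated_only = strip_prompt_tokens(full_output, len(prompt_ids))
print(generated_only)
```

With real tensors the same idea would presumably be a slice along the sequence dimension, starting at the input length. Whether this is needed at all depends on the architecture, since the assumption that the output starts with the prompt may not hold for encoder-decoder models.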
I understand we can use the return_full_text=False parameter in the __call__ method to achieve this. Is there a way to do it directly on the generate method of the