How to limit response to generated output only? Using ChatML

danielritchie · May 14, 2024, 7:44pm

I’m new to using ChatML, but I have successfully generated the response that I want.

However, I am struggling in my attempts to limit the response to the generated content.

Example:

You are helpful robot.
<|im_end|>
<|im_start|>user
Mary had a little ...
<|im_end|>
<|im_start|>assistant

… will produce a response that looks something like:

You are helpful robot.
<|im_end|>
<|im_start|>user
Mary had a little ...
<|im_end|>
<|im_start|>assistant
Mary had a little lamb.

What I am hoping for is a response limited to the generated output only, without any of the input, e.g.:

lamb.

I can of course handle this after I get the response back, but for a variety of reasons I’d like to limit the response if possible (mostly because the input context is massive). I’ve found a few suggestions online but nothing seems to be working.

Also, I haven’t been able to find any definitive overview or documentation regarding ChatML… so maybe this is already clarified in docs that I’m just not aware of?
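
For reference, the post-processing fallback mentioned above (trimming the echoed prompt after the response comes back) might look roughly like the sketch below. The helper name `strip_prompt` is hypothetical, and the leading `<|im_start|>system` line is an assumption about how the full prompt begins, since the example in the post starts partway through.

```python
def strip_prompt(full_text: str, prompt: str) -> str:
    """Return only the text generated after the prompt.

    Hypothetical helper: assumes the model echoes the prompt verbatim
    at the start of its output, as in the example in the post.
    """
    if full_text.startswith(prompt):
        return full_text[len(prompt):]
    # The prompt was not echoed (or was altered), so leave the text alone.
    return full_text


# Assumed ChatML prompt; the opening system tag is a guess.
prompt = (
    "<|im_start|>system\n"
    "You are helpful robot.\n"
    "<|im_end|>\n"
    "<|im_start|>user\n"
    "Mary had a little ...\n"
    "<|im_end|>\n"
    "<|im_start|>assistant\n"
)
full_text = prompt + "Mary had a little lamb."

print(strip_prompt(full_text, prompt))
```

This still transfers the whole echoed prompt over the wire, so it does not help with the massive-context concern; it only cleans up the result.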

danielritchie · May 15, 2024, 3:59am

I have learned that this is a known bug in the specific model I am using.

Still haven’t found a workaround but will update if I do.

danielritchie · May 15, 2024, 2:36pm

I set return_full_text = False and it worked.
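
A minimal sketch of where that setting might go, assuming the request is sent as a JSON payload to a hosted text-generation endpoint (the post does not say which client was used). The helper name `build_payload`, the `max_new_tokens` value, and the opening `<|im_start|>system` line are illustrative assumptions. The `text-generation` pipeline in the transformers library accepts a `return_full_text` argument as well.

```python
import json


def build_payload(prompt: str, max_new_tokens: int = 50) -> dict:
    """Build a text-generation request body that asks for new text only.

    Hypothetical helper; the parameter values are illustrative.
    """
    return {
        "inputs": prompt,
        "parameters": {
            # Do not echo the prompt back in the generated text.
            "return_full_text": False,
            # Assumed cap on the number of generated tokens.
            "max_new_tokens": max_new_tokens,
        },
    }


# Assumed ChatML prompt; the opening system tag is a guess.
prompt = (
    "<|im_start|>system\n"
    "You are helpful robot.\n"
    "<|im_end|>\n"
    "<|im_start|>user\n"
    "Mary had a little ...\n"
    "<|im_end|>\n"
    "<|im_start|>assistant\n"
)

payload = build_payload(prompt)
print(json.dumps(payload, indent=2))
```

The payload is only built and printed here; sending it (endpoint URL, auth token) depends on the setup and is left out.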

CampbellDorsey · May 15, 2024, 5:47pm

Do you think it is possible to set a limit on the number of characters in the response? For example, what if you set a limit of 10,000 characters? I’ll try to solve this problem too, and if it works, I’ll give you an answer.
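
One possible approach, sketched under assumptions: generation limits such as `max_new_tokens` count tokens rather than characters, so a character cap would likely have to be applied to the returned text on the client side. The helper name `truncate_chars` is hypothetical, and backing up to the last space is just one design choice to avoid cutting a word in half.

```python
def truncate_chars(text: str, limit: int = 10_000) -> str:
    """Cut text to at most `limit` characters, preferring a word boundary.

    Hypothetical helper applied after the response is received.
    """
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # Back up to the last space so the result does not end mid-word.
    head, sep, _ = cut.rpartition(" ")
    return head if sep else cut


short = truncate_chars("Mary had a little lamb.", limit=10)
print(short)
```

Because this runs after generation, it does not reduce the number of tokens the model produces; it only bounds what is kept.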

KeziaWahome · July 5, 2024, 9:25am

With chat models this is tricky. I have been trying to limit the generated output without having to reduce the max output tokens.

I have noticed that with instruct models you can put the instructions and limits you want in the prompt itself; for plain base chat models, however … I am still researching this.

I am testing out the Granite models.
