I am using chat completion with Microsoft's Semantic Kernel, and all my responses are truncated to 100 tokens, no matter what I set max_new_tokens to.
I have two questions: can I raise the limit above 100 tokens, and is there a way to programmatically detect that a response was truncated?
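
To make the second question concrete, here is a minimal sketch of the kind of check being asked about. It assumes the backend returns an OpenAI-style `finish_reason` field, where the value `"length"` means generation stopped at the token limit. Whether and where Semantic Kernel exposes that field for a given connector is not confirmed here. The names `is_truncated` and `hit_token_limit` are hypothetical helpers, not part of any library.

```python
from typing import Any, Mapping


def is_truncated(choice: Mapping[str, Any]) -> bool:
    """Return True if an OpenAI-style choice reports it stopped at the token limit.

    Assumption: `choice` is a dict-like object carrying a `finish_reason` key,
    as in OpenAI chat completion responses. Other backends may not provide it.
    """
    return choice.get("finish_reason") == "length"


def hit_token_limit(generated_tokens: int, max_tokens: int) -> bool:
    """Fallback heuristic for backends with no finish reason.

    If the number of generated tokens reached the configured limit, the
    response was probably cut off. This can give false positives when a
    reply happens to end exactly at the limit.
    """
    return generated_tokens >= max_tokens
```

With a check like this, a response flagged as truncated could be retried with a higher limit or followed by a continuation request. The fallback heuristic only helps if the token count of the completion is available.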