Basically, I’m looking to implement a tiny custom version of what LangChain does.
To do that, I need to know which token the model produces first, so that I can dynamically adjust the prompt before continuing generation.
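To make that concrete, here is a minimal sketch of the control flow I have in mind. The `next_token` function is a hypothetical stand-in for a single decoding step of a real model (e.g. something like `generate(max_new_tokens=1)` in transformers); the prompt strings and the `"SEARCH"`/`"ANSWER"` tokens are made-up placeholders:

```python
# Sketch of the "peek at the first token, then adjust the prompt" flow.
# next_token is a dummy stand-in for one decoding step of a real LM.

def next_token(prompt: str) -> str:
    # Dummy model: real code would run the LM for a single step here.
    return "SEARCH" if "tools" in prompt else "ANSWER"

def run(question: str) -> str:
    prompt = f"You may use tools.\nQ: {question}\nA:"
    first = next_token(prompt)  # peek at the first generated token
    if first == "SEARCH":
        # Rebuild the prompt (here with a placeholder tool result)
        # before handing it back to the model for full generation.
        prompt = f"Q: {question}\nTool result: ...\nA:"
    return prompt
```

The real version would then call the model again on the rewritten prompt, which is the part where my beam-search question comes in.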
My question is: since this is a language model interacting with people, ideally I’d be using beam search. How does beam search handle stopping_criteria? Is there anything special I need to take into account before trying?
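To show where my confusion is, here is a toy beam search with the stopping criterion plugged in the way I currently understand it: the criterion is evaluated once per step against all beams together, rather than per beam. This is just my reading of how it might work, not a claim about the transformers internals (the vocabulary, probabilities, and `stop` function are all made up):

```python
# Toy beam search illustrating where a stopping criterion plugs in.
# Assumption (my understanding, to be confirmed): the criterion sees
# every beam at each step and stopping is decided for them jointly.

import math

VOCAB = {"a": 0.5, "b": 0.3, "</s>": 0.2}  # toy next-token distribution

def stop(beams, max_len=4):
    # Criterion checks all beams; True only when every beam should stop.
    return all(
        len(seq) >= max_len or (seq and seq[-1] == "</s>")
        for seq, _ in beams
    )

def beam_search(num_beams=2):
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    while not stop(beams):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == "</s>":  # finished beams carry over
                candidates.append((seq, score))
                continue
            for tok, p in VOCAB.items():
                candidates.append((seq + [tok], score + math.log(p)))
        # Keep the num_beams highest-scoring candidates.
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:num_beams]
    return beams[0][0]
```

In this toy, a max-length criterion ends all beams at the same step, and a finished beam just carries over until the rest catch up. Is that roughly what happens with stopping_criteria and num_beams > 1, or do beams interact with the criteria in some way I should watch out for?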