Stopping criteria

I have a model that I’m using through an inference endpoint. When running it locally, I implemented a custom stopping criterion.

Is there a way to embed the stopping criterion in the model deployed through the inference endpoint?


Did you ever figure this out? I’m in a similar situation and would love to know how you handled it.

In case it helps others, the answer is here: Deploy LLMs with Hugging Face Inference Endpoints
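For anyone landing here later: the approach in that post is to add a custom handler (`handler.py`) to the model repository, which Inference Endpoints will load instead of the default pipeline. Inside the handler you can pass a `StoppingCriteriaList` to `generate` just as you would locally. Below is a minimal sketch, assuming a causal LM; the stop token ids, generation parameters, and the `StopOnTokens` class are illustrative, not from the original thread.

```python
# handler.py — sketch of a custom Inference Endpoints handler that
# embeds a stopping criterion. Model path, stop ids, and generation
# settings are assumptions for illustration.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    StoppingCriteria,
    StoppingCriteriaList,
)


class StopOnTokens(StoppingCriteria):
    """Stop generation as soon as the newest token is one of the stop ids."""

    def __init__(self, stop_token_ids):
        self.stop_token_ids = set(stop_token_ids)

    def __call__(self, input_ids, scores, **kwargs):
        # input_ids has shape (batch, sequence); inspect the last token.
        return int(input_ids[0, -1]) in self.stop_token_ids


class EndpointHandler:
    def __init__(self, path=""):
        # `path` points at the model repository the endpoint was built from.
        self.tokenizer = AutoTokenizer.from_pretrained(path)
        self.model = AutoModelForCausalLM.from_pretrained(path)

    def __call__(self, data):
        prompt = data["inputs"]
        inputs = self.tokenizer(prompt, return_tensors="pt")
        # Wrap the criterion in a StoppingCriteriaList, as generate expects.
        stopping = StoppingCriteriaList(
            [StopOnTokens([self.tokenizer.eos_token_id])]
        )
        output = self.model.generate(
            **inputs,
            max_new_tokens=128,
            stopping_criteria=stopping,
        )
        text = self.tokenizer.decode(output[0], skip_special_tokens=True)
        return {"generated_text": text}
```

Commit this file (plus a `requirements.txt` if you need extra packages) to the repo, and the endpoint will route requests through `EndpointHandler.__call__`, so the stopping criterion is applied server-side.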