I have a model that I'm using through an inference endpoint. When using it locally, I implemented a stopping criterion.
Is there a way to embed the stopping criterion in the model deployed through the inference endpoint?
Thanks
Did you ever get to figure this out? I have a similar situation and would love to know how you handled this.
In case it’s of help for others, the answer is here: Deploy LLMs with Hugging Face Inference Endpoints
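To summarize the approach from that guide: Inference Endpoints let you ship a custom `handler.py` next to the model weights, exposing an `EndpointHandler` class with `__init__(self, path)` and `__call__(self, data)`; the stopping criterion can live inside that handler. Below is a minimal, self-contained sketch. The stop-sequence values and the `should_stop` helper are illustrative assumptions; in a real handler you would load the model in `__init__` and pass a `transformers.StoppingCriteria` subclass to `model.generate(..., stopping_criteria=...)`.

```python
# Sketch of stop-sequence logic inside a custom handler.py for an
# Inference Endpoint. `should_stop` is a hypothetical helper; a real
# handler would wrap the same check in a transformers.StoppingCriteria
# subclass and hand it to model.generate via a StoppingCriteriaList.

def should_stop(generated_text: str, stop_sequences: list) -> bool:
    """Return True once the decoded output ends with any stop sequence."""
    return any(generated_text.endswith(seq) for seq in stop_sequences)


class EndpointHandler:
    """Skeleton following the custom-handler contract used by
    Inference Endpoints: __init__(path) and __call__(data) -> list."""

    def __init__(self, path: str = "") -> None:
        # In a real handler: load the tokenizer and model from `path` here.
        self.stop_sequences = ["###", "</answer>"]  # illustrative values

    def __call__(self, data: dict) -> list:
        # In a real handler: call model.generate with a StoppingCriteriaList.
        # Here we only demonstrate the check on an already-decoded string.
        text = data.get("inputs", "")
        return [{
            "generated_text": text,
            "stopped": should_stop(text, self.stop_sequences),
        }]
```

The advantage of doing this in the handler rather than client-side is that generation actually halts at the stop sequence on the server, instead of producing extra tokens you then truncate.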