I’m writing a new library in Go using the serverless inference API and I hit a few problems:
- The documentation at Chat Completion is very focused on the Python library and doesn't cover much of the REST API, to the point that the URL format to use isn't even listed. I use `"https://router.huggingface.co/hf-inference/models/" + model + "/v1/chat/completions"`. I do not need OpenAI compatibility; whatever is closest to the native implementation is better for me.
- When I make a mistake, I get a whole HTML page containing `<h1>503</h1>` instead of an error message in JSON. That's really hurting my progress. It seems there's a reverse proxy on the router that is eating the error messages.
- I failed to create a test example that works with a JSON schema for a structured reply. What example (in any language) would you point me to? I see that Célina and Lucain recently updated the test case `test_chat_completion_with_response_format()` in `huggingface_hub/tests/test_inference_client.py`, and it's now skipped.