Standardised REST API properties for structured outputs

I’m exploring adding HuggingFace support to my app so my clients can use their own HuggingFace account to use serverless inference endpoints for text generation.

Our app uses the REST API directly and we need to be able to use Chat Completions with JSON Schema based structured outputs (or GBNF grammars, although I suspect that is too specialist).

However when speaking to Nebius it seems that I can only use this when connecting directly to them via a “guided_json” proprietary property in the JSON data POSTed to the chat completions REST API endpoint. It’s unclear whether I can use this when accessing via HuggingFace and using the HF routing. There does not appear to be any documentation on HF regarding cURL access (for example) with structured outputs. There doesn’t appear to be a standard. I’d expect structured outputs to follow the OpenAI Chat Completions specification, which is not “guided_json”, in any kind of HF-presented standardised API.

What’s the guidance, if I wish to use JSON Schema structured outputs with HF routed chat completions?

1 Like

response_format argument might be useful.

Yes, clearly using the OpenAI assumed standard for (albeit deprecated) Chat Completions of response_format is the way to go:

"response_format": {
      "type": "json_schema",
      "json_schema":  { ... }
}

Note that nebius appear to support response_format for json mode (arbitrary json, as per OpenAI, below) but use different guided_json` for structured.

"response_format": { 
    "type": "json_object" 
}
1 Like