Standardised REST API properties for structured outputs

stevespencer · August 5, 2025, 2:05pm

I’m exploring adding HuggingFace support to my app so my clients can use their own HuggingFace account to use serverless inference endpoints for text generation.

Our app uses the REST API directly and we need to be able to use Chat Completions with JSON Schema based structured outputs (or GBNF grammars, although I suspect that is too specialist).

However when speaking to Nebius it seems that I can only use this when connecting directly to them via a “guided_json” proprietary property in the JSON data POSTed to the chat completions REST API endpoint. It’s unclear whether I can use this when accessing via HuggingFace and using the HF routing. There does not appear to be any documentation on HF regarding cURL access (for example) with structured outputs. There doesn’t appear to be a standard. I’d expect structured outputs to follow the OpenAI Chat Completions specification, which is not “guided_json”, in any kind of HF-presented standardised API.

What’s the guidance, if I wish to use JSON Schema structured outputs with HF routed chat completions?

John6666 · August 6, 2025, 9:55am

response_format argument might be useful.

stevespencer · August 6, 2025, 12:31pm

Yes, clearly using the OpenAI assumed standard for (albeit deprecated) Chat Completions of response_format is the way to go:

"response_format": {
      "type": "json_schema",
      "json_schema":  { ... }
}

Note that nebius appear to support response_format for json mode (arbitrary json, as per OpenAI, below) but use different guided_json` for structured.

"response_format": { 
    "type": "json_object" 
}

Topic		Replies	Views
Serverless Inference API doesn't seem to support a dedicated JSON mode Inference Endpoints on the Hub	0	234	June 23, 2024
Structured output in Typescript SDK Beginners	1	36	August 11, 2025
Different parameters between JSON inference and Inference API 🤗Hub	0	1397	March 9, 2022
Trouble with the built in inference API example Beginners	2	936	June 19, 2021
Serverless inference issues for a new Go library Inference Endpoints on the Hub	4	83	March 18, 2025

Standardised REST API properties for structured outputs

Related topics