Is it possible to have an inference endpoint return a response that isn't JSON?

zachmullen · August 28, 2024, 7:28pm

Hello, I’m using an inference endpoint for an image segmentation task (i.e. a U-Net under the hood) and the output of the pipeline is a segmentation mask, i.e. an image. I’d like to have my endpoint simply return this image as the HTTP response body with some reasonable Content-Type header, but from the docs it’s not clear if anything besides JSON serialization is supported. When I tried returning a bytes object from my handler, for instance, it failed.

I know I could always base64 encode the data and use JSON, but I’d prefer to avoid that additional overhead if it’s possible.

Thanks!
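
For reference, the base64-in-JSON fallback mentioned above could look roughly like the sketch below. It uses only the standard library; the function names and the `mask_base64` key are made up for illustration and are not part of any Hugging Face API. It also shows where the overhead comes from: base64 turns every 3 bytes into 4 characters, so the payload grows by about a third.

```python
import base64
import json


def encode_mask_as_json(mask_bytes: bytes) -> str:
    """Wrap raw image bytes in a JSON document (hypothetical schema)."""
    encoded = base64.b64encode(mask_bytes).decode("ascii")
    return json.dumps({"mask_base64": encoded})


def decode_mask_from_json(body: str) -> bytes:
    """Recover the raw image bytes on the client side."""
    return base64.b64decode(json.loads(body)["mask_base64"])


if __name__ == "__main__":
    # Placeholder bytes standing in for an encoded segmentation mask.
    raw = bytes(range(256)) * 12
    body = encode_mask_as_json(raw)
    assert decode_mask_from_json(body) == raw
    # The JSON body is noticeably larger than the raw payload.
    print(f"raw: {len(raw)} bytes, JSON body: {len(body)} characters")
```

This works with any JSON-only endpoint, at the cost of the extra size plus an encode and decode step on each side.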

John6666 · August 29, 2024, 12:39am

Basically, it should be possible. (For example, we routinely do so from Gradio in Spaces.)
I’d say the documentation is there, but it’s scattered…

github.com

gradio-app/gradio/blob/main/gradio/external.py

"""This module should not be used directly as its API is subject to change. Instead,
use the `gr.Blocks.load()` or `gr.load()` functions."""

from __future__ import annotations

import json
import os
import re
import tempfile
import warnings
from pathlib import Path
from typing import TYPE_CHECKING, Callable, Literal

import httpx
import huggingface_hub
from gradio_client import Client
from gradio_client.client import Endpoint
from gradio_client.documentation import document
from packaging import version

This file has been truncated.

After all, with HF it’s often faster to read the code and try it out than to read the documentation diligently from cover to cover.
I’m not a hacker or anything…

zachmullen · August 29, 2024, 5:03pm

I tried to reverse engineer this a bit since it’s not that well documented, and attempted to use the Accept header to indicate that the response should be returned raw. However, it looks like a raw response isn’t one of the supported auto-serialization modes. The error message was:

('\n'
 '                Accept type "application/octet-stream" not supported.\n'
 '                Supported accept types are:\n'
 '                application/json, text/csv, text/plain, image/png, '
 'image/jpeg, image/jpg, image/tiff, image/bmp, image/gif, image/webp, '
 'image/x-image, audio/x-flac, audio/flac, audio/mpeg, audio/x-mpeg-3, '
 'audio/wave, audio/wav, audio/x-wav, audio/ogg, audio/x-audio, audio/webm, '
 'audio/webm;codecs=opus, audio/AMR, audio/amr, audio/AMR-WB, audio/AMR-WB+, '
 'audio/m4a, audio/x-m4a\n'
 '            ')

From what I can tell from that list, there isn’t a way to indicate that the response should be returned raw as the body. I tried text/plain, but that did not work; the server appeared to look for custom serialization logic that didn’t exist.
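
Since the list in the error message does include several image types, one thing to try is asking for the mask as `image/png` instead of `application/octet-stream`. The sketch below only builds such a request with the standard library. The URL, the token, and the `{"inputs": ...}` payload shape are placeholders and assumptions, and whether the endpoint then returns image bytes depends on the handler returning something the server knows how to serialize as an image, which this thread does not confirm.

```python
import json
import urllib.request

# Image types copied from the "Supported accept types" error message above.
SUPPORTED_IMAGE_ACCEPT = {
    "image/png", "image/jpeg", "image/jpg", "image/tiff",
    "image/bmp", "image/gif", "image/webp", "image/x-image",
}


def build_mask_request(
    url: str, token: str, payload: dict, accept: str = "image/png"
) -> urllib.request.Request:
    """Build (but do not send) a POST request that asks for an image response."""
    if accept not in SUPPORTED_IMAGE_ACCEPT:
        raise ValueError(f"Accept type {accept!r} is not in the supported image list")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": accept,
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical endpoint URL, token, and payload; replace with real values.
    request = build_mask_request(
        "https://example.invalid/my-endpoint", "hf_xxx", {"inputs": "..."}
    )
    print(request.get_method(), request.get_header("Accept"))
```

Sending it would be a matter of `urllib.request.urlopen(request)` and reading the response body as bytes, then checking the returned Content-Type before treating the body as an image.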

John6666 · August 30, 2024, 2:26am

Well, I guess it’s faster to receive it as an image than as JSON.
No, it’s not faster…?
Anyway, for example, Diffusers’ documentation recommended that images be received and saved with PIL.Image.

HF documentation is inevitably auto-generated from comments in the code, so except for the introductory sections, it is faster to read the code if you can read it.
It is even faster to imitate someone else who is doing it well.
