Trouble returning audio from Inference Endpoints

Hi there,

Thanks in advance for the help. I am trying to return audio from a model deployed to an Inference Endpoint.

I am using a custom handler.py, but the Inference Endpoint does not accept audio/wav in the Accept header, so I basically need a hack here. I thought I would just encode the WAV file into a base64 string and embed the string in the JSON object. That didn't work: something in the JSON marshaling or unmarshaling is corrupting the WAV file when I decode it in the client.
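
For reference, this is roughly the round trip I had in mind (a minimal sketch; the `audio` field name, the placeholder WAV bytes, and the exact handler structure are just illustrative):

```python
# handler.py (sketch) - return the generated WAV bytes as a base64 string inside the JSON response
import base64


class EndpointHandler:
    def __init__(self, path: str = ""):
        # load the model here; omitted for brevity
        pass

    def __call__(self, data: dict) -> dict:
        # ... run inference and get the WAV file contents as bytes ...
        wav_bytes = b"RIFF...WAVEfmt "  # placeholder for the real WAV bytes
        b64_audio = base64.b64encode(wav_bytes).decode("utf-8")
        return {"audio": b64_audio}
```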

Then I thought I would send the request with the Accept header set to some other binary type like image/jpeg, but in that case the handler expects a PIL image as output.

Finally, I tried text/plain and returned the base64 string directly, but I keep getting this error:

2023/08/14 23:26:55 ~ ERROR | 'NoneType' object has no attribute 'serialize'

Meanwhile, when I send a request with Accept: image/jpeg, I get an error confirming that I am indeed returning the string:

2023/08/14 23:07:26 ~ ERROR | Can only serialize PIL.Image.Image, got <class 'str'>

I know this is a hack, but any ideas?

thanks.

Hello @pbotsaris,

Thank you for opening the thread. I checked with the team and it's correct: we currently do not have automatic serialization for audio on the service.
When you send */* or application/json as the Accept header, you get the audio back as a base64 string.
Could you share what you are trying to do, so we can prioritize and add this feature?
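
As a sketch of what I mean (the endpoint URL, token, and the `audio` key are placeholders and depend on what your handler returns):

```python
# client (sketch) - request with Accept: application/json and decode the base64 audio
import base64

import requests

API_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"  # placeholder
headers = {
    "Authorization": "Bearer <hf_token>",  # placeholder
    "Content-Type": "application/json",
    "Accept": "application/json",
}

response = requests.post(API_URL, headers=headers, json={"inputs": "some text"})
b64_audio = response.json()["audio"]  # key must match what the handler returns
wav_bytes = base64.b64decode(b64_audio)

# write in binary mode so the WAV data is not altered
with open("output.wav", "wb") as f:
    f.write(wav_bytes)
```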

@philschmid I am having this issue as well. I would like to send an Accept: 'audio/wav' header, for example, but that is not currently supported.

Currently, there are convenient ways to return various formats here, but there are no audio formats, and audio is a large part of inference on Hugging Face.

@pbotsaris Were you able to solve this problem?