Trouble returning audio from Inference Endpoints

Hi there,

Thanks in advance for the help. I am trying to return audio from a model deployed to an Inference Endpoint.

I am using a custom handler.py, but the Inference Endpoint does not accept audio/wav in the Accept header, so I basically need a hack here. I thought I would just encode the WAV file into a base64 string and embed the string in the JSON object. That didn't work: something in the JSON marshaling or unmarshaling is corrupting the WAV file when I decode it in the client.
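
For reference, this is roughly the round trip I had in mind (a minimal sketch; the `audio` field name, the placeholder WAV bytes, and the exact handler structure are just illustrative):

```python
# handler.py (sketch) - return the generated WAV bytes as a base64 string inside the JSON response
import base64


class EndpointHandler:
    def __init__(self, path: str = ""):
        # load the model here; omitted for brevity
        pass

    def __call__(self, data: dict) -> dict:
        # ... run inference and get the WAV file contents as bytes ...
        wav_bytes = b"RIFF...WAVEfmt "  # placeholder for the real WAV bytes
        b64_audio = base64.b64encode(wav_bytes).decode("utf-8")
        return {"audio": b64_audio}
```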

Then I thought I would send the request with the Accept header set to some other binary type like image/jpeg, but in that case the handler expects a PIL image as output.

Finally, I tried text/plain and returned the base64 string directly, but I keep getting this error:

2023/08/14 23:26:55 ~ ERROR | 'NoneType' object has no attribute 'serialize'

Meanwhile, when I send a request with Accept: image/jpeg, I get an error confirming that I am indeed returning the string:

2023/08/14 23:07:26 ~ ERROR | Can only serialize PIL.Image.Image, got <class 'str'>

I know this is a hack, but any ideas?

thanks.

Hello @pbotsaris,

Thank you for opening the thread. I checked with the team and it's correct: we currently do not have automatic serialization for audio on the service.
When you send */* or application/json as the Accept header, you get the audio back as a base64 string.
Could you share what you are trying to do, so we can prioritize and add this feature?
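
As a sketch of what I mean (the endpoint URL, token, and the `audio` key are placeholders and depend on what your handler returns):

```python
# client (sketch) - request with Accept: application/json and decode the base64 audio
import base64

import requests

API_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"  # placeholder
headers = {
    "Authorization": "Bearer <hf_token>",  # placeholder
    "Content-Type": "application/json",
    "Accept": "application/json",
}

response = requests.post(API_URL, headers=headers, json={"inputs": "some text"})
b64_audio = response.json()["audio"]  # key must match what the handler returns
wav_bytes = base64.b64decode(b64_audio)

# write in binary mode so the WAV data is not altered
with open("output.wav", "wb") as f:
    f.write(wav_bytes)
```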

@philschmid I am having this issue as well. I would like to send an Accept: 'audio/wav' header, for example, but that is not currently supported.

Currently, there are convenient ways to return various formats here, but there are no audio formats, and audio is a large part of inference on Hugging Face.

@pbotsaris Were you able to solve this problem?