Using Inference API with large audio files

murdockthedude · May 11, 2022, 9:10pm

Is there currently a way to use the Inference API for audio tasks with larger audio files? At the moment it appears limited to very short audio files; anything longer returns a 413 Payload Too Large error. The docs are silent on this question, but our use case requires running inference on long audio files, so if this isn’t supported the API becomes a non-option for us.

Given that the docs specify the request must consist of a binary payload, I’m inclined to think inference on long audio files isn’t presently supported?

We have the API working for a short test file, but a 9-minute FLAC file is rejected by the server as too large (and our use case involves files of up to 60 minutes).

Thought someone here might know! Thanks.

murdockthedude · May 12, 2022, 6:02pm

Anyone have some insight on this?

Thanks!

rbt073 · June 3, 2022, 7:46pm

I’m getting the same 413 message on all Speech Separation models. I tried both 8k and 16k sampling frequency. Hope to get some confirmation about this.

skorkmaz88 · July 14, 2022, 5:22pm

I am facing a similar problem with image object recognition.

Bobisat12 · September 16, 2022, 10:07am

Hi there,

My name is Julien, from France.

I am also trying to post large audio files to a speech-to-text API hosted here, and I get the 413 error message.

Did anyone find a solution or get any information from the tech team?

Best Regards,

Julien
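
Nobody in the thread has confirmed an official fix, so the following is only a possible client-side workaround sketch, not a documented solution: split the audio into short pieces and send each piece as its own request so that every payload stays small. It assumes the audio is available as uncompressed WAV (the FLAC file mentioned above would need converting first, since Python's standard library cannot decode FLAC). The helper names `split_wav_bytes` and `post_chunk` are made up for illustration, and the endpoint URL, the token, and the Bearer-style header are assumptions to check against the API docs. What chunk length actually avoids the 413 is unknown here.

```python
import io
import wave
import urllib.request


def split_wav_bytes(data: bytes, chunk_seconds: float) -> list[bytes]:
    """Split an in-memory WAV file into standalone WAV files,
    each holding at most chunk_seconds of audio."""
    chunks = []
    with wave.open(io.BytesIO(data), "rb") as src:
        params = src.getparams()
        frames_per_chunk = max(1, int(src.getframerate() * chunk_seconds))
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            buf = io.BytesIO()
            with wave.open(buf, "wb") as dst:
                # Copy channels, sample width and rate from the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            chunks.append(buf.getvalue())
    return chunks


def post_chunk(url: str, token: str, chunk: bytes) -> bytes:
    """Send one chunk as a raw binary request body and return the raw response.
    The url, token and Authorization header format are assumptions."""
    req = urllib.request.Request(
        url,
        data=chunk,
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

The per-chunk results would then have to be joined on the client side. Cutting at fixed time boundaries can split a word in half, so overlapping the chunks or cutting at silences may give better results; neither is shown here.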
