How to use Inference API to perform speech recognition

Hi everyone.
I am working on a custom project that requires speech recognition.
I need to set up speech recognition with Whisper, but the project is just for internal use by my family.

In a realistic scenario, I would not make more than 100 requests per day, and each request would use the model for 1-10 minutes.

I set up an Inference Endpoint, thinking I would only be billed when my requests happen, but the endpoint appears to always be in a “Running” state, so I have simply been billed €0.50 per hour since I created it, which is far too much for me, completely unbearable.

So I tried to use the API:

import os

import requests
from dotenv import load_dotenv

load_dotenv()


API_TOKEN = os.environ.get("HF_TOKEN")

audio_file_path = "sample_data/10_min_meeting.wav"

headers = {"Authorization": f"Bearer {API_TOKEN}"}
API_URL = "https://api-inference.huggingface.co/models/openai/whisper-large-v3"


def query(filename):
    # Send the raw audio bytes as the request body
    with open(filename, "rb") as f:
        audio_bytes = f.read()
    response = requests.post(API_URL, headers=headers, data=audio_bytes)
    return response


response = query(audio_file_path)
print(response.status_code, response.text)

But this only results in a 413 “Payload Too Large” error.
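For reference, a back-of-the-envelope estimate explains the 413: an uncompressed WAV grows linearly with duration, so a 10-minute recording is tens of megabytes, well beyond typical request-body limits. A minimal sketch (the 16 kHz mono 16-bit format is an assumption, not a value read from my actual file):

```python
def wav_payload_bytes(seconds, rate=16000, sample_width=2, channels=1):
    """Estimate the size in bytes of an uncompressed WAV payload.

    Assumes a standard 44-byte RIFF header plus raw PCM frames.
    The default rate/width/channels are assumptions; adjust them
    to the file's actual parameters.
    """
    return 44 + seconds * rate * sample_width * channels


# A 10-minute (600 s) recording at 16 kHz, 16-bit mono:
print(wav_payload_bytes(600))  # → 19200044, i.e. roughly 18 MiB
```

So even at a modest sample rate, a file like mine is far over any single-digit-megabyte request limit.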

What I need is just to run the model to perform the speech recognition remotely, since I do not have the hardware to run it on my own.
I am willing to pay for PRO or for other paid services if they let me achieve this in a feasible manner (€0.50 per hour is not feasible; I would like to stay under €50/month).


The default limit seems to be around 2 MB. However, model authors and Space authors may have a way to increase it…?
I have also heard that the limit is relaxed when using an Inference Endpoint.
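If the serverless limit cannot be raised, one workaround is to split the audio into chunks that each fit under the limit and transcribe them one request at a time. A rough sketch using only the standard library’s `wave` module (the 30-second chunk length is an arbitrary assumption, and naive splitting can cut a word in half at a chunk boundary):

```python
import io
import wave


def split_wav(path, chunk_seconds=30):
    """Split a WAV file into in-memory WAV payloads of at most
    chunk_seconds each, suitable for sending as separate requests."""
    chunks = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = src.getframerate() * chunk_seconds
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            buf = io.BytesIO()
            with wave.open(buf, "wb") as dst:
                dst.setparams(params)  # same rate/width/channels as the source
                dst.writeframes(frames)
            chunks.append(buf.getvalue())
    return chunks
```

Each returned bytes object is a complete WAV file, so it can be passed directly as a request body; the per-chunk transcripts then need to be concatenated in order.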