Accelerated Inference API Automatic Speech Recognition

boxabirds · July 10, 2021, 6:32pm

Hi, I’m trying to use the Automatic Speech Recognition API, but the docs are … light.

When I copy/paste the example code from the docs (below for convenience):

import json

import requests

headers = {"Authorization": f"Bearer {API_TOKEN}"}
API_URL = "https://api-inference.huggingface.co/models/facebook/wav2vec2-base-960h"

def query(filename):
    with open(filename, "rb") as f:
        data = f.read()
    response = requests.request("POST", API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8"))

data = query("sample1.flac")

… I get …

{
  "error": "Model facebook/wav2vec2-base-960h is currently loading",
  "estimated_time": 20
}

It then says in the docs “no other parameters are currently allowed”. Does this mean I can’t ask it to use a GPU for instance?

So:

1. It’d be nice if the docs had sample code that worked out of the box. Developer UX is important.
2. It’d be nice if the docs also documented the response format. For instance, even in the error result, is "estimated_time": 20 in minutes, days, centuries, nanoseconds?
3. A very common ASR feature is word-by-word timestamps for alignment use cases. Does this API support that, or in any way harmonise the ASR engines underneath (SpeechBrain and another one)?

I’m poised to shell out some big bucks for GPU-level support at HF but I need to see much more pro-level docs in this area.

kinso · December 6, 2021, 7:14am

@boxabirds I got the same problem. Looks like no one responded to your post… did you work out the error?

dblandan · September 13, 2022, 5:47pm

The 503 error happens while the model is loading. The 20 is, I believe, seconds, but I find this to be generally inaccurate. I have a while loop with some logging that’s essentially:

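# First attempt; the loop below re-sends while the model is still loading (HTTP 503).
response = requests.request("POST", API_URL, headers=headers, data=data)
while response.status_code == 503:
    response = requests.request("POST", API_URL, headers=headers, params={"wait_for_model": True}, data=data)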

For the ASR model I was using, it took almost 3 minutes from cold start to first response. After that, everything goes smoothly.
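For anyone landing here, a slightly fuller sketch of that retry idea, with a time limit so it cannot spin forever. It is not taken from the docs: query_with_retry and its parameters are hypothetical names, and it assumes "estimated_time" is a rough hint in seconds (as guessed above), so it falls back to a fixed delay when the field is missing or unusable.

```python
import time


def query_with_retry(post, max_wait=300.0, default_delay=5.0, sleep=time.sleep):
    """Call post() until it stops returning HTTP 503 or max_wait seconds are spent.

    post is a zero-argument callable returning a requests-style response
    (an object with .status_code and .json()). Hypothetical helper, not an
    official client.
    """
    waited = 0.0
    while True:
        response = post()
        if response.status_code != 503:
            # Anything other than "model loading" is handed back to the caller.
            return response
        if waited >= max_wait:
            raise TimeoutError(f"model still loading after {waited:.0f}s")
        # Assumption: "estimated_time" is a hint in seconds; it may be absent.
        try:
            hint = float(response.json().get("estimated_time", default_delay))
        except (ValueError, TypeError, AttributeError):
            hint = default_delay
        # Sleep at least 1s, and never beyond the remaining time budget.
        delay = max(1.0, min(hint, max_wait - waited))
        sleep(delay)
        waited += delay
```

With requests, this would be called as query_with_retry(lambda: requests.post(API_URL, headers=headers, data=data)), reusing API_URL, headers and data from the original example. Given the roughly 3 minute cold start mentioned above, a max_wait of a few minutes seems a reasonable starting point.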
