Widget faster than Inference API?


I am seeing a processing-time difference of 1 to 2 seconds between the widget (not cached) and the Inference API when testing the same .wav file.

The model I tested: speechbrain/asr-wav2vec2-commonvoice-fr. I checked that this model does not support accelerated inference.

I am wondering whether this is normal behavior.

To reproduce

Use the same .wav file to test the model with:

  1. The widget: speechbrain/asr-wav2vec2-commonvoice-fr · Hugging Face (not the cached result)
  2. The Inference API, using the code sample below:
import json
import requests

model_id = "speechbrain/asr-wav2vec2-commonvoice-fr"
API_URL = f"https://api-inference.huggingface.co/models/{model_id}"
headers = {"Authorization": f"Bearer {API_TOKEN}"}  # API_TOKEN must be a valid Hugging Face access token

def query(filename, API_URL, headers):
    # Send the raw audio bytes to the Inference API and decode the JSON response
    with open(filename, "rb") as f:
        data = f.read()
    response = requests.post(API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8"))

data = query("example2.wav", API_URL, headers)
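To put numbers on the 1–2 second gap, each call can be wrapped in a simple timer. `timed` below is a hypothetical helper (not part of the Inference API); in practice `fn` would be the `query` function above, called with the same .wav file.

```python
import time

def timed(fn, *args, **kwargs):
    # Return (result, seconds elapsed) for a single call of fn
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in call for illustration; replace with:
# transcript, seconds = timed(query, "example2.wav", API_URL, headers)
result, seconds = timed(sum, range(1000))
```

Running the timed call a few times for both the widget and the API would make the comparison less sensitive to network jitter.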

Expected behavior

Similar processing times for the widget and the Inference API.