khof312
September 12, 2024, 9:12am
I would like to use Meta’s MMS model with the InferenceClient. However, it supports multiple languages, and I need to pass the following keyword arguments:
{"target_lang":"spa", "ignore_mismatched_sizes":True}
How can I do this? Here is my code:
from huggingface_hub import InferenceClient
client = InferenceClient()
client.automatic_speech_recognition("test.wav", model="facebook/mms-1b-all").text
It’s returning a result, but it clearly doesn’t recognize which language is being spoken.
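For context, locally I would pass these via model_kwargs when building a transformers pipeline, roughly like this (an untested sketch of the local equivalent, not the InferenceClient call I’m asking about):

from transformers import pipeline

# local equivalent (sketch): target_lang selects the Spanish adapter,
# ignore_mismatched_sizes lets the language-model head be resized to
# that language's vocabulary
pipe = pipeline(
    "automatic-speech-recognition",
    model="facebook/mms-1b-all",
    model_kwargs={"target_lang": "spa", "ignore_mismatched_sizes": True},
)
print(pipe("test.wav")["text"])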
Wauplin
September 12, 2024, 10:04am
Hi @khof312, this is unfortunately not officially supported by InferenceClient. You can do something like this:
from huggingface_hub import AutomaticSpeechRecognitionOutput, InferenceClient
from huggingface_hub.inference._common import _b64_encode
client = InferenceClient()
# raw POST to the Inference API: base64-encode the audio and pass the extra parameters manually
response = client.post(
    json={
        "inputs": _b64_encode("test.wav"),
        "parameters": {
            "target_lang": "spa",
            "ignore_mismatched_sizes": True,
        },
    },
    model="facebook/mms-1b-all",
)
# parse the raw response into the usual output dataclass
output = AutomaticSpeechRecognitionOutput.parse_obj_as_instance(response)
but remember that this is not officially supported.
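Once parsed, the output should expose the same fields as the regular helper’s return value, so (assuming the call succeeded) the transcription is available as:

print(output.text)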
To add new inputs and parameters to the InferenceClient, you first need to add them to the “official” tasks specification here. Only popular parameters are added, to avoid bloating the API (and constraining ourselves with respect to future breaking changes).