Hello Hugging Face community,
I’m using the Whisper model for speech-to-text on Dutch audio files, served through a dedicated Hugging Face Inference Endpoint. My problem is that I can’t get the transcription to come back in Dutch, the language of the audio.
Below is the code snippet I’m currently using:
import requests

# Endpoint URL and access token are redacted here.
API_URL = "-----"
headers = {
    "Accept": "application/json",
    "Authorization": "Bearer ----",
    "Content-Type": "audio/wav",
}

def query(filename):
    # Send the raw WAV bytes as the request body.
    with open(filename, "rb") as f:
        data = f.read()
    response = requests.post(API_URL, headers=headers, data=data)
    return response.json()

output = query("rec.wav")
print(output)
This setup successfully sends the audio file to the inference endpoint and receives a response. However, the transcription comes back in Chinese instead of Dutch.
How can I specify Dutch as the language in the request?
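For reference, here is a minimal sketch of one approach that might work; it is untested against this endpoint. Instead of posting raw bytes, it sends a JSON body with the audio base64-encoded under `inputs` and the language under `parameters.generate_kwargs`. Both of those key names are assumptions about what the endpoint's default handler accepts, and `build_payload` and `query_with_language` are hypothetical helper names. If the default handler ignores these parameters, a custom handler that passes the language to the pipeline would be the fallback.

```python
import base64


def build_payload(audio_bytes, language="dutch", task="transcribe"):
    """Wrap raw audio in a JSON-serializable body that carries generation options."""
    return {
        # Assumption: the endpoint accepts base64-encoded audio under "inputs".
        "inputs": base64.b64encode(audio_bytes).decode("ascii"),
        # Assumption: "generate_kwargs" is forwarded to Whisper's generate() call.
        "parameters": {"generate_kwargs": {"language": language, "task": task}},
    }


def query_with_language(filename, api_url, token, language="dutch"):
    """POST the audio file as JSON so the language hint travels with it."""
    import requests  # imported lazily so build_payload works without it

    with open(filename, "rb") as f:
        payload = build_payload(f.read(), language=language)
    headers = {
        "Accept": "application/json",
        "Authorization": f"Bearer {token}",
        # JSON instead of audio/wav, since the body is no longer raw bytes.
        "Content-Type": "application/json",
    }
    response = requests.post(api_url, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()
```

It would be called as `query_with_language("rec.wav", API_URL, token)`, with the same redacted URL and token as above. Setting `task` to `"transcribe"` is meant to keep Whisper from translating the audio to English.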