How to set audio language in Whisper Pipeline?

I’d like to use the Whisper model in an ASR pipeline for languages other than English, but I’m not sure how to tell the pipeline which language the audio file is in. By default, it seems to understand the audio (which is in German) but then always translates it into English:

from transformers import pipeline
pipe = pipeline(task="automatic-speech-recognition", model="openai/whisper-small")
pipe("testfile.mp3")

# ground truth: "So sind verschiedene Überlandstrecken geplant."
# model prediction: "So are various crossings planned."

I tried adding language="de" when creating the pipeline or when calling the pipeline, but to no avail.


@sanchit-gandhi I’ve got the same problem with a model I fine-tuned following your amazing blog post.
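
One thing that may be worth trying, assuming a reasonably recent transformers release in which Whisper's generate() accepts language and task arguments: pass them through generate_kwargs when calling the pipeline, rather than as arguments to pipeline() itself. The sketch below is untested against the audio file from the question, and the helper names whisper_generate_kwargs and transcribe are made up for illustration. Older releases may not accept these arguments in this form.

```python
def whisper_generate_kwargs(language, task="transcribe"):
    """Build the generate_kwargs dict for a Whisper ASR pipeline call.

    Assumption: the installed transformers version lets Whisper's
    generate() take `language` and `task` directly.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    return {"language": language, "task": task}


def transcribe(audio_path, language, model="openai/whisper-small"):
    """Hypothetical helper: transcribe audio_path in the given language."""
    # Imported here so the kwargs helper above works without transformers.
    from transformers import pipeline

    pipe = pipeline(task="automatic-speech-recognition", model=model)
    # "transcribe" keeps the output in the source language, whereas
    # "translate" asks Whisper to produce English text.
    result = pipe(audio_path, generate_kwargs=whisper_generate_kwargs(language))
    return result["text"]
```

With this in place, transcribe("testfile.mp3", "german") should ask for a German transcription instead of an English translation. The same generate_kwargs call should apply to a fine-tuned checkpoint by passing its name or path as model.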