I’d like to use the Whisper model in an ASR pipeline for languages other than English, but I’m not sure how to tell the pipeline which language the audio file is in. By default, the model seems to understand the audio (which is in German) but then always translates it into English:
```python
from transformers import pipeline

pipe = pipeline(task="automatic-speech-recognition", model="openai/whisper-small")
pipe("testfile.mp3")
# ground truth:     "So sind verschiedene Überlandstrecken geplant."
# model prediction: "So are various crossings planned."
```
I tried adding `language="de"`, both when creating the pipeline and when calling it, but to no avail.
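Concretely, this is a sketch of what I tried (neither attempt got me German output):

```python
from transformers import pipeline

def transcribe_attempts(audio_path: str = "testfile.mp3"):
    # Attempt 1: pass the language when constructing the pipeline
    pipe = pipeline(
        task="automatic-speech-recognition",
        model="openai/whisper-small",
        language="de",  # no effect on the output language
    )
    # Attempt 2: pass it at call time instead
    return pipe(audio_path, language="de")
```

Is there a supported way to force the transcription language, or am I passing the parameter in the wrong place?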