I’d like to use the Whisper model in an ASR pipeline for languages other than English, but I’m not sure how to tell the pipeline which language the audio file is in. By default, it seems to actually understand the meaning of the audio (which is in German) but then always translates it into English:
from transformers import pipeline
pipe = pipeline(task="automatic-speech-recognition", model="openai/whisper-small")
pipe("testfile.mp3")
# ground truth: "So sind verschiedene Ăśberlandstrecken geplant."
# model prediction: "So are various crossings planned."
I tried adding language="de" both when creating the pipeline and when calling it, but to no avail.
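In newer transformers releases, passing the language through generate_kwargs may be the intended route; a minimal sketch of that variant, assuming a transformers version whose Whisper generate accepts language and task arguments:

from transformers import pipeline

pipe = pipeline(task="automatic-speech-recognition", model="openai/whisper-small")
# Request a German transcription rather than a translation into English
result = pipe("testfile.mp3", generate_kwargs={"language": "german", "task": "transcribe"})
print(result["text"])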
By default, the Whisper pipeline forces the IDs of some special tokens at the beginning of the generated sequence (pipe.model.config.forced_decoder_ids), namely [[1, 50259], [2, 50359], [3, 50363]].
The meaning of these token IDs can be found in the added_tokens.json file:
"<|en|>": 50259,
"<|transcribe|>": 50359,
"<|notimestamps|>": 50363
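You can also resolve these IDs programmatically via the tokenizer instead of digging through the JSON file; a minimal sketch:

from transformers import WhisperTokenizer

tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-small")
# Look up the IDs of the special tokens
print(tokenizer.convert_tokens_to_ids("<|en|>"))          # 50259
print(tokenizer.convert_tokens_to_ids("<|de|>"))          # 50261
print(tokenizer.convert_tokens_to_ids("<|transcribe|>"))  # 50359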
You can set the first forced token ID to that of "<|de|>" (ID 50261) explicitly, like this:
pipe.model.config.forced_decoder_ids[0][1] = 50261
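A cleaner way to build the whole list, rather than patching it by hand, is the processor's get_decoder_prompt_ids helper, assuming a transformers version that ships it:

from transformers import WhisperProcessor

processor = WhisperProcessor.from_pretrained("openai/whisper-small")
# Produces forced IDs of the form [(1, <|de|>), (2, <|transcribe|>), (3, <|notimestamps|>)]
pipe.model.config.forced_decoder_ids = processor.get_decoder_prompt_ids(
    language="german", task="transcribe"
)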
Alternatively, you can simply set the forced_decoder_ids to None, which leaves the model to detect the language itself, but in my experience this does not work as reliably for German input:
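pipe.model.config.forced_decoder_ids = None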
Quick follow-up: where can we find the language codes / languages that Whisper supports? (Or how can we correctly map a two-letter language code to the corresponding Whisper language?)
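In case it helps, the Whisper tokenizer module in transformers ships dictionaries mapping between two-letter codes and the language names Whisper uses; a minimal sketch, assuming these module-level constants exist in your transformers version:

from transformers.models.whisper.tokenization_whisper import LANGUAGES, TO_LANGUAGE_CODE

# Two-letter code -> language name understood by Whisper
print(LANGUAGES["de"])             # "german"
# Language name -> two-letter code
print(TO_LANGUAGE_CODE["german"])  # "de"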