Adding custom vocabularies on Whisper

Here are 2 other approaches.
No training required, so I highly recommend trying this before fine-tuning models or changing their architecture.

1. Initial Prompt

You can simply use the parameter initial_prompt to create a bias towards your vocabulary.
In your example, you could write: "Let's talk about International Monetary Fund and SDRs."
This will encourage the model to repeat the term SDRs and other terms related to finances.


or…

2. Suppress Tokens

Sometimes whisper keeps using a wrong word. It that’s the case, you may suppress that token.

For example, let’s pretend there’s a Latin name “Esthear” and whisper transcribes to “I hold access to Esthear’s…”
Pretend this name is represented by tokens:
("Esthe", "ar")(98765, 12345)

If you suppress the token “Esthe”, Whisper will need to come up with alternatives to transcribe your audio… And hopefully guessing “SDR” correctly.

But be careful not to suppress common tokens. If you suppress both tokens “Esthe” + “ar”, it might impact other words, like “mo-net-ar-y”, tow-ar-ds".


Code Example

initial_prompt = "Let's talk about International Monetary Fund and SDRs."
model.transcribe(audio_file, initial_prompt=initial_prompt, suppress_tokens=[98765]
2 Likes