Timestamps reduce Whisper hallucinations?

There are several threads where it's claimed that using return_timestamps=True in Whisper grounds the model and discourages it from hallucinating. Are there any pointers on why that helps?


I don't think it's something that's been explored formally, but empirically most people have found that setting return_timestamps=True helps reduce hallucinations, particularly when doing long-form evaluation with Transformers' "chunked" algorithm (note that timestamps are a requirement for OpenAI's "sequential" algorithm).
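For reference, here's a minimal sketch of enabling this with the chunked algorithm; the checkpoint and audio file below are placeholders:

```python
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",
    chunk_length_s=30,  # activates Transformers' chunked long-form algorithm
)

# predicting timestamps is what empirically curbs hallucinated repetitions
result = asr("audio.mp3", return_timestamps=True)

print(result["text"])
for chunk in result["chunks"]:
    print(chunk["timestamp"], chunk["text"])
```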

My interpretation is that forcing the model to predict timestamps works against hallucinating. Suppose you have the transcription:

The cat sat on the on the on the mat.

where "on the" is a repeated hallucination. If we ask the model to predict timestamps, then the repeated "on the" has to fit within the overall segment-level timing, e.g.:

<|0.00|> The cat sat on the on the on the mat.<|5.02|>

But it's implausible to fit three copies of "on the" within the time allotted to the segment, so the probability of this hallucinatory sequence drops, and the model instead assigns the highest probability to the correct transcription:

<|0.00|> The cat sat on the mat.<|5.02|>
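You can see these timestamp tokens directly by decoding the generated ids yourself. A rough sketch, assuming waveform is a 16 kHz mono audio array you have already loaded (placeholder name):

```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("openai/whisper-small")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

inputs = processor(waveform, sampling_rate=16_000, return_tensors="pt")
generated_ids = model.generate(inputs.input_features, return_timestamps=True)

# decode_with_timestamps=True keeps the <|x.xx|> tokens in the output string,
# e.g. "<|0.00|> The cat sat on the mat.<|5.02|>"
print(
    processor.batch_decode(
        generated_ids, skip_special_tokens=True, decode_with_timestamps=True
    )[0]
)
```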

Interesting that there's not much formal exploration of this. The Whisper authors focus more on other heuristics for long-form transcription, such as temperature fallback (cf. Section 4.5 of the paper).
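For completeness, Transformers exposes the same fallback heuristics for sequential long-form generation. A sketch reusing model and inputs from the snippet above (argument names as in recent transformers versions, so worth checking against your installed version):

```python
generated_ids = model.generate(
    inputs.input_features,
    return_timestamps=True,  # required for sequential long-form decoding
    temperature=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0),  # retry at higher temperatures
    compression_ratio_threshold=1.35,  # retry if the output is too repetitive
    logprob_threshold=-1.0,            # retry if avg token log-prob is too low
    no_speech_threshold=0.6,           # skip segments classified as silence
)
```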

The end timestamp is in a sense the opposite of the initial timestamp constraint they describe in the paper: it helps the model remove extra words at the end of the sequence (whereas the initial timestamp constraint helps when the model skips words at the start), but the overall principle is the same: using timestamps to improve the probability of more realistic sequences.
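If you want to experiment with that initial timestamp constraint, my understanding is that recent Transformers versions surface it on Whisper's generation config:

```python
# The initial-timestamp constraint, as (I believe) exposed in Whisper's
# generation config: 50 steps of 0.02 s caps the first predicted timestamp
# at 1.0 s into the segment, discouraging the model from skipping early words
print(model.generation_config.max_initial_timestamp_index)  # e.g. 50
```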