Two Whisper classes for generation, but the same functionality?

Are there any differences between WhisperForConditionalGeneration and WhisperForCausalLM? From the documentation, they are very similar to each other.

For WhisperForConditionalGeneration, it says:

The Whisper Model with a language modeling head. Can be used for automatic speech recognition. This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads etc.)

And for WhisperForCausalLM:

Whisper decoder with a language modeling head on top (linear layer with weights tied to the input embeddings). This model inherits from PreTrainedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving, resizing the input embeddings, pruning heads etc.)

It looks like both of them have a language modeling head on top. But are there any other differences between these classes?

Best

Hi @alerio,

I had the same question, and it turns out that WhisperForCausalLM is used solely to load the assistant (draft) model for speculative decoding.

Rather than loading the whole encoder-decoder model, WhisperForCausalLM loads only the decoder with a language modeling head on top.
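For reference, here is a minimal sketch of how the two classes fit together in speculative decoding. The checkpoint names `openai/whisper-large-v2` and `distil-whisper/distil-large-v2` are just one possible main/assistant pairing, and the audio here is a silent placeholder:

```python
import numpy as np
from transformers import (
    WhisperForCausalLM,
    WhisperForConditionalGeneration,
    WhisperProcessor,
)

processor = WhisperProcessor.from_pretrained("openai/whisper-large-v2")

# Full encoder-decoder model that produces the actual transcription.
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2")

# Decoder-only assistant (draft) model; no encoder is instantiated.
assistant_model = WhisperForCausalLM.from_pretrained("distil-whisper/distil-large-v2")

# Placeholder input: 2 seconds of silence at 16 kHz (use real audio here).
audio = np.zeros(2 * 16000, dtype=np.float32)
input_features = processor(
    audio, sampling_rate=16000, return_tensors="pt"
).input_features

# Passing `assistant_model` to generate() enables assisted (speculative) decoding.
predicted_ids = model.generate(input_features, assistant_model=assistant_model)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print(transcription)
```

The assistant drafts candidate tokens cheaply and the main model verifies them, so you get the full model's output quality with fewer sequential decoder passes.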

You can find more details in the initial PR from Patrick: [WhisperForCausalLM] Add WhisperForCausalLM for speculative decoding by patrickvonplaten · Pull Request #27195 · huggingface/transformers · GitHub
