What is the language modeling loss (for next-token prediction) for the HuBERT model?

I came across the HubertForCTC documentation, which says that the loss returned by this model is the language modeling loss for next-token prediction. Can someone explain what this “next token” is in an ASR model?

For example, given the sentence “The quick brown fox jumps over the lazy dog”, the language modeling losses (for the next token) in other NLP models (e.g., BERT, GPT-2) relate to the next predicted “word” in this sentence. Is the next-token language modeling loss for an ASR model also related to the next predicted “word”? Or is it actually about the next predicted frame of the input wav, which is associated with a character?
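To make concrete what I mean by next-token language modeling loss, here is a toy sketch of the usual mean negative log-probability over next words; the probabilities are made up for illustration, not taken from any real model:

```python
import math

# Toy next-token LM loss: at each position t, the model assigns a
# probability to the true token at position t+1; the loss is the mean
# negative log-probability of those true next tokens.
sentence = ["The", "quick", "brown", "fox"]

# Hypothetical probabilities the model gives the true next word:
# P("quick" | "The"), P("brown" | "The quick"), P("fox" | "The quick brown")
p_next = [0.20, 0.10, 0.50]

loss = -sum(math.log(p) for p in p_next) / len(p_next)
print(loss)
```

My question is whether, for an ASR model, the “token” in this computation is a word like above, or a per-frame character prediction.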

Thanks in advance!