XLSR-53: To group tokens or not to group tokens

In @patrickvonplaten's Fine-Tuning XLSR-53 notebook, he mentions that tokens should not be grouped when computing metrics (in that notebook's case, the WER metric), which makes sense. However, later in the notebook he uses the processor to decode the predictions without passing the `group_tokens=False` argument to the method.

Shouldn't we decode the same way when computing metrics and when outputting predictions? Which way is the correct one? This is probably a minor issue for languages that rarely duplicate graphemes, but I'm curious, since it could affect the perceived performance one way or the other.
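To make the question concrete, here is a minimal stdlib-only sketch of what `group_tokens` controls during CTC-style decoding. This mimics the behavior, it is not the actual `transformers` implementation, and the vocabulary, pad symbol, and helper name are made up for illustration:

```python
from itertools import groupby

PAD = "<pad>"  # assumption for this sketch: the pad token doubles as the CTC blank


def ctc_decode(tokens, group_tokens=True):
    """Optionally collapse adjacent repeats (CTC grouping), then drop pad tokens.

    With group_tokens=False the repeats survive, which matters when the
    repeats are legitimate doubled graphemes rather than CTC duplicates.
    """
    if group_tokens:
        # merge runs of identical adjacent tokens into a single token
        tokens = [key for key, _ in groupby(tokens)]
    return "".join(t for t in tokens if t != PAD)


# A blank between the two "l" frames is what keeps the double "l" in "hello":
frames = ["h", "e", "l", "l", PAD, "l", "o"]
print(ctc_decode(frames))                      # "hello": adjacent repeats merged
print(ctc_decode(frames, group_tokens=False))  # "helllo": every frame kept
```

This is also why decoding *label* sequences with grouping enabled is dangerous: labels have no blanks between repeated characters, so a legitimate "ll" would be contracted to a single "l", skewing the metric relative to the raw prediction output.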

Could someone clarify this for me?

Hey @jjdv,

Could you check whether this issue answers your question: wav2vec2: `convert_tokens_to_string` contracts legitimately repeated characters · Issue #10619 · huggingface/transformers · GitHub?