Different versions of the wav2vec2 model and their differences

Hey everyone. I want to use wav2vec2 to perform ASR using data in my language (Greek). As such, I took a look at the various wav2vec2 pretrained models that exist in the model hub, and there are two things I don’t understand:

Some versions, like facebook/wav2vec2-large-lv60 · Hugging Face, say in the description that the model ‘should be fine-tuned on a downstream task, like Automatic Speech Recognition’. On the other hand, other versions like ‘facebook/wav2vec2-large-960h-lv60’ (sorry, can’t post more than 2 links) impose no such requirement and also provide code snippets showing how to use that particular model.

Furthermore, the first group of models does not have code examples, but instead links to this excellent blog post, Fine-Tune XLSR-Wav2Vec2 for low-resource ASR with 🤗 Transformers, which I have studied.

Forgive me if my question is getting too long, but I’d like to ask something more related to this blog post. I noticed two things: (1) the author does not load the tokenizer and feature extractor with the ‘from_pretrained()’ method, but instantiates them directly, and (2) the second group of models mentioned earlier (those whose pages include code examples) do not use a feature extractor at all in their snippets. What are the reasons behind these distinctions?
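To make point (1) concrete, here is a minimal sketch of the two loading patterns I mean. The tiny vocabulary below is made up just for illustration (the blog post builds the real one from the dataset’s characters), and the commented-out line assumes the ‘facebook/wav2vec2-large-960h-lv60’ checkpoint since ‘from_pretrained()’ needs to download from the Hub:

```python
import json
import os
import tempfile

from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2Processor,
)

# A tiny, made-up vocab just for this sketch; the blog post derives
# the real vocab.json from the characters in the training data.
vocab = {"[PAD]": 0, "[UNK]": 1, "|": 2, "a": 3, "b": 4}
vocab_path = os.path.join(tempfile.mkdtemp(), "vocab.json")
with open(vocab_path, "w") as f:
    json.dump(vocab, f)

# Pattern (1), as in the blog post: build each piece by hand.
tokenizer = Wav2Vec2CTCTokenizer(
    vocab_path,
    unk_token="[UNK]",
    pad_token="[PAD]",
    word_delimiter_token="|",
)
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1,
    sampling_rate=16000,
    padding_value=0.0,
    do_normalize=True,
    return_attention_mask=True,
)
processor = Wav2Vec2Processor(
    feature_extractor=feature_extractor, tokenizer=tokenizer
)

# Pattern (2), as in the model-card snippets: load everything that was
# saved with the fine-tuned checkpoint (downloads from the Hub).
# processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-large-960h-lv60")
```

As I understand it, pattern (1) is needed when you are fine-tuning from scratch on a new language, because no tokenizer/vocabulary exists yet for your alphabet, whereas pattern (2) reuses the one saved alongside an already fine-tuned model.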

Sorry again for the lengthy question. I’d really appreciate any help. Thanks in advance!

A correction to my post: the first link was meant to point to this version of wav2vec2: facebook/wav2vec2-large-xlsr-53 · Hugging Face