PreTrain Wav2Vec2 in Spanish

Currently, the only pretrained Wav2Vec2 models that cover Spanish are multilingual. Let’s pretrain a Wav2Vec2 model on Spanish only.


We start from a randomly initialized Wav2Vec2 model.
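As a rough sketch of what "randomly initialized" could look like in code: building the model from a config, rather than from a pretrained checkpoint, gives fresh random weights. This assumes a transformers release in which the Flax port has landed and exposes `FlaxWav2Vec2ForPreTraining`, with `flax` and `jax` installed. The helper name `build_random_wav2vec2` and the choice of default config values are only illustrative, not something the project has settled on.

```python
def build_random_wav2vec2(seed=0):
    """Return a Wav2Vec2 pretraining model with randomly initialized weights.

    Hypothetical helper: the name and the default config are assumptions
    made for illustration only.
    """
    # Imported lazily so that defining the helper does not require
    # transformers/flax to be installed.
    from transformers import FlaxWav2Vec2ForPreTraining, Wav2Vec2Config

    # Default hyperparameters (assumption: the project may pick other values).
    config = Wav2Vec2Config()

    # Constructing from a config (not `from_pretrained`) initializes the
    # parameters randomly; `seed` controls the PRNG used for initialization.
    return FlaxWav2Vec2ForPreTraining(config, seed=seed)
```

Calling `build_random_wav2vec2()` would then return a model whose `params` are untrained and ready for pretraining.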


We can use the Spanish portion of Common Voice. The dataset is available through the datasets library as common_voice (see “common_voice · Datasets at Hugging Face”).
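For anyone who wants to try the data quickly, here is a minimal sketch of loading it. It assumes the Spanish subset is exposed under the config name `"es"` and that the installed datasets version still supports the `common_voice` dataset id. The helper name `load_spanish_common_voice` is made up for this example; `load_dataset` is the only real library call.

```python
# Dataset id as named in the post.
COMMON_VOICE_DATASET = "common_voice"
# Assumption: the Spanish portion is selected with the config name "es".
SPANISH_CONFIG = "es"


def load_spanish_common_voice(split="train", streaming=False):
    """Load one split of the Spanish portion of Common Voice.

    Hypothetical helper written for illustration only.
    """
    # Imported lazily so that defining the helper does not require
    # the datasets library to be installed.
    from datasets import load_dataset

    # `streaming=True` avoids downloading the full archive up front,
    # which can help for a quick look at a large dataset.
    return load_dataset(
        COMMON_VOICE_DATASET,
        SPANISH_CONFIG,
        split=split,
        streaming=streaming,
    )
```

Something like `load_spanish_common_voice("train", streaming=True)` should be enough to iterate over a few examples before committing to a full download.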

Available training scripts

FlaxWav2Vec2 will be merged soon (see “[Flax] Add wav2vec2” by patrickvonplaten, Pull Request #12271 in huggingface/transformers on GitHub), and a pretraining script should be relatively easy to merge as well.

(Optional) Desired project outcome

The best Spanish ASR model.

(Optional) Challenges

It would be nice to use more data than just the Common Voice dataset.

I am in!
cc: @patrickvonplaten @valhalla


Awesome, added you both to the team :slight_smile:


I am interested in this one as well, if there’s still time to join! :slight_smile:


I took part in the Wav2Vec2 fine-tuning week for Spanish; this is my forum post. My model card describes some of the pre-processing and training steps I took.

It would be awesome to create a pretrained model using Jax/Flax. I’m not sure I’ll have the time to take part in this one (I already signed up for a different project, and I know nothing about Jax), but I’ll try to follow your discussions if I can. Good luck!