Pretrain Wav2Vec2 in Spanish
Currently, the only pretrained Wav2Vec2 model that covers Spanish is a multilingual one. Let’s pretrain a Wav2Vec2 model on Spanish only.
Model
A randomly initialized Wav2Vec2 model.
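As a rough sketch of what "randomly initialized" could look like in practice: the model is built from a configuration object rather than loaded with from_pretrained. This assumes the Flax class ends up being exposed as FlaxWav2Vec2ForPreTraining in transformers and that the library's default Wav2Vec2Config is an acceptable starting point; the helper names build_config and init_random_model are made up for illustration.

```python
from transformers import Wav2Vec2Config


def build_config(**overrides) -> Wav2Vec2Config:
    """Return a Wav2Vec2 configuration (library defaults unless overridden)."""
    return Wav2Vec2Config(**overrides)


def init_random_model(config: Wav2Vec2Config, seed: int = 0):
    """Create a model with freshly initialized (not pretrained) weights."""
    # Imported lazily so that building the config alone does not need Flax/JAX.
    # Assumption: the Flax pretraining class is available under this name.
    from transformers import FlaxWav2Vec2ForPreTraining

    # Calling the constructor (instead of from_pretrained) draws random
    # weights from the given seed; no checkpoint is downloaded.
    return FlaxWav2Vec2ForPreTraining(config, seed=seed)


if __name__ == "__main__":
    config = build_config()
    model = init_random_model(config, seed=0)
    print(type(model).__name__)
```

The config and the model are kept separate so that hyperparameters can be changed (for example build_config(num_hidden_layers=6)) before any weights are allocated.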
Datasets
We can use the Spanish portion of Common Voice. The dataset is available through the datasets library here: common_voice · Datasets at Hugging Face.
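A minimal sketch of loading that data with the datasets library might look like the following. It assumes the Hub dataset id common_voice with the config name es, that the examples have an audio column, and that the installed datasets version still supports this dataset; the helper name load_spanish_common_voice is hypothetical. The audio is resampled to 16 kHz, the sampling rate Wav2Vec2 models are usually trained on.

```python
# Assumed Hub identifiers for the Spanish portion of Common Voice.
COMMON_VOICE_ES = {"path": "common_voice", "name": "es"}
# Sampling rate Wav2Vec2 models are usually trained on.
TARGET_SAMPLING_RATE = 16_000


def load_spanish_common_voice(split: str = "train"):
    """Load one split of Spanish Common Voice, resampled to 16 kHz."""
    # Imported lazily so the constants above can be reused without `datasets`.
    from datasets import Audio, load_dataset

    dataset = load_dataset(split=split, **COMMON_VOICE_ES)
    # Assumption: the dataset exposes an "audio" column. Casting makes the
    # library resample each clip when it is decoded.
    return dataset.cast_column("audio", Audio(sampling_rate=TARGET_SAMPLING_RATE))


if __name__ == "__main__":
    train = load_spanish_common_voice("train")
    print(train)
```

For pretraining only the raw audio is needed, so the transcription column can be ignored at this stage.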
Available training scripts
FlaxWav2Vec2 will be merged soon (see [Flax] Add wav2vec2 by patrickvonplaten · Pull Request #12271 · huggingface/transformers · GitHub), and a pretraining script should be relatively easy to merge as well.
(Optional) Desired project outcome
The best Spanish ASR model.
(Optional) Challenges
It would be nice to use more data than just the Common Voice dataset.