Speech-To-Text in 60 languages
What it is about
The goal of the event is to provide the community with state-of-the-art XLSR-Wav2Vec2 speech recognition models in as many languages as possible. We hope that research in speech recognition for low-resource languages in particular can benefit from it.
How does it work?
Participants have one week to fine-tune as many XLSR-Wav2Vec2 models as they want, on as many of the ~60 languages of Common Voice as they want. Each fine-tuned model should then be evaluated on Common Voice's test data for the respective language. All data can be used as training data except the official test data (compliance will be checked by the Hugging Face team). After the fine-tuning week, the best-performing model for each language will receive SWAG.
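Rankings on the test data are typically based on word error rate (WER). As an illustration only (not the official evaluation script), here is a minimal WER implementation using word-level edit distance:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1 deletion / 6 words
```

In practice you would compute this over the whole test split and average; lower is better.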
What do I need to do to participate?
All you need is a Google Colab account to run the fine-tuning script we provide. If you can train the model on a local GPU, even better, but a free Google Colab is enough. This fine-tuning week should be especially interesting to you if you are a native speaker of a low-resource language, because your language skills can help you build a better data processing pipeline than the competition.
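"Data processing pipeline" here mostly means transcript normalization, which is where language knowledge matters: deciding which characters are real parts of the alphabet and which are noise to strip. A hedged sketch of such a step (the exact character set is language-specific and purely illustrative here):

```python
import re

# Hypothetical normalization step: which punctuation to remove, and whether
# lowercasing is even meaningful, depends entirely on the target language.
CHARS_TO_REMOVE = re.compile(r'[,\.\?\!\-\;\:"]')

def normalize(sentence: str) -> str:
    """Lowercase and strip punctuation so model outputs match references."""
    return CHARS_TO_REMOVE.sub("", sentence).lower().strip()

print(normalize("Hello, World!"))  # -> "hello world"
```

A native speaker might extend this with rules for abbreviations, numerals, or diacritics that a generic pipeline would get wrong.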
If you want to participate, all you need to do is sign up to the hub here and post your name and your hub username in this thread. The Hugging Face team will then add you to an internal Slack channel where you will receive more detailed information!
What do I get?
- enjoy a bit of the Hugging Face vibe by joining the fine-tuning week
- a fine-tuned model under your name on the hub
- Hugging Face SWAG if you manage to have the best-performing model in a language
Patrick & Suraj