[Open-to-the-community] XLSR-Wav2Vec2 Fine-Tuning Week for Low-Resource Languages

:hugs: Speech-To-Text in 60 languages :earth_americas: :earth_africa: :earth_asia:

Hi all,

We are organizing a community week (Mar 22nd to Mar 29th) to fine-tune the cross-lingual speech recognition model XLSR-Wav2Vec2 on all languages of the crowd-sourced Common Voice dataset.

What it is about

The goal of the event is to provide the community with state-of-the-art XLSR-Wav2Vec2 speech recognition models in as many languages as possible. We especially hope that research in speech recognition for low-resource languages will benefit from it.

How does it work

Participants have one week to fine-tune as many XLSR-Wav2Vec2 models as they want on as many of the ~60 Common Voice languages as they want. Each fine-tuned model should then be evaluated on Common Voice’s test data for the respective language. All data can be used as training data, except the official test data (which will be checked by the Hugging Face team). After the fine-tuning week, the authors of the best-performing model in each language will receive :hugs: SWAG.
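The announcement does not say which metric decides the "best performing" model. As an illustration only, the sketch below assumes word error rate (WER), a common metric for speech recognition, computed over a whole test set. The function names `edit_distance` and `corpus_wer` are hypothetical and are not part of the fine-tuning script provided by the organizers.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def corpus_wer(references: List[str], hypotheses: List[str]) -> float:
    """Assumed metric: total word edits divided by total reference words."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()
        total_edits += edit_distance(ref_words, hyp.split())
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references must contain at least one word")
    return total_edits / total_words


if __name__ == "__main__":
    refs = ["the cat sat", "hello world"]
    hyps = ["the bat sat", "hello world"]
    print(f"WER: {corpus_wer(refs, hyps):.2f}")
```

Under this assumption, a lower score is better, and transcripts would typically be normalized (casing, punctuation) the same way for references and predictions before scoring.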

What do I need to do to participate

All you need is a Google Colab account to run the fine-tuning script provided by us. If you can train the model on a local GPU, even better, but a free Google Colab account is enough. This fine-tuning week should be especially interesting to you if you are a native speaker of a low-resource language, because your language skills can help you build a better data processing pipeline than the competition.

If you want to participate, all you need to do is sign up on the :hugs: hub here and post your name and your :hugs: hub username in this thread. The Hugging Face team will then add you to an internal Slack channel where you will receive more detailed information!

What do I get

  • a bit of the Hugging Face vibe by joining the fine-tuning week
  • a fine-tuned model under your name on the :hugs: hub
  • Hugging Face SWAG if you manage to train the best-performing model in a language

Open-sourcely yours,

Patrick & Suraj

46 Likes

Name: Laxya Agarwal
Username: laxya007

Very excited to be part of this competition.

2 Likes

Hey Patrick, feel free to add me!

4 Likes

Hello! Good news, I’m on it!

Name: Anton Nekrasov
Nickname: gorodecki

2 Likes

Name: Khursani
Username: khursani8

2 Likes

Name: Dewi Bryn Jones
Username: DewiBrynJones

1 Like

Name: Ceyda
Username: ceyda

2 Likes

Name: George Mazzeo
Username: CupOfGeo

1 Like

Name: Ganesh
Username: ganesh3

2 Likes

Name: Muhammad Agung Hambali
Username: ayameRushia

2 Likes

Name: Manan Dey
Username: manandey

2 Likes

Name: Ozcan Gundes
Username: ozcangundes

2 Likes

Name: Karim Foda
Username: kmfoda

1 Like

I will work on Arabic.

Name: Zaid Alyafeai
Username: Zaid

4 Likes

Name: Harshit Rathore
Username: HarshitRathore

2 Likes

Name: Vasudev Gupta
Username: vasudevgupta

2 Likes

Name: Galuh Sahid
Username: Galuh

I will work on Indonesian.

3 Likes