Indonesian ASR: Fine-Tuning Wav2Vec2

I’m in. It could be easier to achieve a lower WER if we combine our efforts, but how do we decide who works on what?

Here are some samples from YouTube closed captions: http://148.251.140.232/youtube-asr/

Maybe we can create a Telegram group :thinking:

I also requested the dataset from TITML-IDN - Speech Resources Consortium yesterday. I will update here if I hear back.

If everyone uses Telegram, I can create a group. Feel free to send me a message with your username and I’ll add you. Maybe we can divide our work there. I was thinking some of us could each focus on tuning one hyperparameter (e.g. number of epochs, learning rate) while keeping the other hyperparameters fixed, and the rest could focus on integrating additional datasets (when available). Then we can combine the best results of our experiments for the final model. What do you all think?
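To make the split concrete, here is a minimal sketch of a one-hyperparameter-at-a-time plan. The baseline values and the candidate values in `SWEEPS` are placeholders I made up for illustration, not settings anyone here has agreed on or reported.

```python
# Hypothetical baseline: placeholder values, not settings reported in this thread.
BASELINE = {"num_train_epochs": 30, "learning_rate": 3e-4, "batch_size": 16}

# Hypothetical candidate values; each key could be "owned" by one volunteer.
SWEEPS = {
    "num_train_epochs": [30, 60],
    "learning_rate": [1e-4, 3e-4, 5e-4],
}


def one_at_a_time(baseline, sweeps):
    """Yield (changed_param, config) pairs that vary one hyperparameter
    while every other hyperparameter stays at its baseline value."""
    for name, values in sweeps.items():
        for value in values:
            if value == baseline[name]:
                continue  # the shared baseline run already covers this value
            config = dict(baseline)
            config[name] = value
            yield name, config


plan = list(one_at_a_time(BASELINE, SWEEPS))
for name, config in plan:
    print(name, config)
```

Everyone would also run (or share) the single baseline config, so each experiment can be compared against the same reference point.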

It looks nice, but in many cases the audio is longer than the text in the JSON file. For example, in the sound file http://148.251.140.232/youtube-asr/056yb0seQWc/10517504935501761652288671477516212840.mp3, the speaker says “karena di sekolah lamanya dia kedapatan merampok dan hampir membunuh seorang de…”, whereas the JSON file contains only the text “karena di sekolah lamanya dia kedapatan”.
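One rough way to flag such clips automatically is to compare transcript length against audio duration. The sketch below assumes each clip is a dict with `text` and `duration_s` fields (hypothetical names, since I have not checked the actual JSON schema), and the 8 characters-per-second floor is an arbitrary starting point that would need tuning on real data.

```python
def chars_per_second(text, duration_s):
    """Transcript length divided by clip length; low values hint at missing words."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return len(text.strip()) / duration_s


def split_by_coverage(clips, min_cps=8.0):
    """Split clips into (kept, suspicious) using a characters-per-second floor.

    min_cps is an assumed threshold, not a value validated on this dataset.
    """
    kept, suspicious = [], []
    for clip in clips:
        cps = chars_per_second(clip["text"], clip["duration_s"])
        (kept if cps >= min_cps else suspicious).append(clip)
    return kept, suspicious


# The durations below are made up for illustration only.
clips = [
    {"id": "a", "text": "karena di sekolah lamanya dia kedapatan", "duration_s": 3.0},
    {"id": "b", "text": "karena di sekolah lamanya dia kedapatan", "duration_s": 9.0},
]
kept, suspicious = split_by_coverage(clips)
print([c["id"] for c in kept], [c["id"] for c in suspicious])
```

This only catches transcripts that are much too short for their audio; it says nothing about whether the words that are present are correct.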

My Telegram name is @CahyaWirawan

My telegram is @magungh1

My telegram username is @Wikidepia

Hi @munggok, I see you already started with wav2vec a few days ago. Do you want to join our discussion here? :slight_smile:

Hi @cahya, sure, count me in.
I would love to join.

I managed to fine-tune xlsr-indonesia a week ago and got an eval WER of 0.401.
The log: trainer_state.json · munggok/xlsr_indonesia at main
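For anyone new to the metric: WER is the word-level edit distance (substitutions, deletions, insertions) between the reference and the prediction, divided by the number of reference words, so 0.401 means roughly 40 word errors per 100 reference words. Below is a minimal pure-Python sketch for a single sentence pair. It is not the evaluation code used for the run above, and the example sentences are made up.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One deleted word out of four reference words.
print(wer("saya pergi ke sekolah", "saya pergi sekolah"))
```

Over a whole evaluation set, the errors and reference word counts are normally summed across all utterances before dividing, rather than averaging per-sentence scores.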

Basically, I used the same steps as Patrick’s tutorial on fine-tuning XLSR, but trained a bit longer (changed the number of epochs to 60) and used a larger batch size (32).

As for improvements, I agree: we probably need additional datasets beyond Common Voice.

Putting the link to an Indonesian speech dataset that hasn’t been posted here (CMIIW)

Great, please also join our Telegram channel, where we chat more :slight_smile:, and have a look at the list where we split our tasks: Fine-Tune XLSR-Wav2Vec2 on Indonesian ASR with 🤗 Transformers - Checklist - Google Sheets

What’s your Telegram ID? I’ll add you to the group

Okay

My Telegram ID is @acul3

Hi everyone, sorry I’m late to join the discussion.
I’m new to this topic and hope to learn a lot from it.

I’ve read the fine-tuning checklist in the Google spreadsheet. I’d like to contribute by combining the Common Voice dataset with Shaip.

Please add me to the Telegram group @yasirabdr.

This blog is very useful and relevant to an article I’ve read. For more detail, you can visit https://devel.pusatbahasa.unair.ac.id/V1/sejarah-singkat/