The Division of Speech, Music and Hearing at KTH is working on a Swedish Wav2Vec2 model.
Our first step will be to fine-tune the multilingual model on Swedish voice recordings and transcriptions to build a Swedish speech-to-text model.
You are welcome to help out. This thread might also be useful for people working on similar tasks in other languages.
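For anyone wanting to prepare data ahead of the fine-tuning step described above: CTC fine-tuning needs a character vocabulary built from the transcriptions. Below is a minimal sketch, assuming lowercase text, `|` as the word delimiter, and `[UNK]`/`[PAD]` as special tokens. The function name `build_ctc_vocab` and the sample sentences are hypothetical, not part of any library or of our dataset.

```python
def build_ctc_vocab(transcriptions):
    """Map every character found in the transcriptions to an integer id."""
    chars = set()
    for text in transcriptions:
        chars.update(text.lower())
    # Spaces are represented by an explicit word-delimiter token instead.
    chars.discard(" ")
    vocab = {ch: idx for idx, ch in enumerate(sorted(chars))}
    # Assumed special tokens: word delimiter, unknown, and padding (CTC blank).
    vocab["|"] = len(vocab)
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


# Hypothetical sample sentences, only to show that å, ä and ö are kept.
sample_transcriptions = ["jag är här", "båten går över sjön"]
vocab = build_ctc_vocab(sample_transcriptions)
```

The resulting dictionary can be written to a JSON file and used as the tokenizer vocabulary. For Swedish, the main thing to check is that å, ä and ö survive any text normalization you apply before this step.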
I’ve added most of 's Wav2Vec2 code and I’m more than happy to help with fine-tuning the multilingual checkpoint for Swedish, so feel free to tag me in this thread with any questions you might have.
Also, I’m planning to release a detailed notebook about fine-tuning Wav2Vec2 in a couple of days, which I’ll link here.
Hi @patrickvonplaten, I am also interested in fine-tuning wav2vec2ctc and would be interested in the notebook. Thank you!
Checking on when this notebook would be ready for us mere mortals.
@patrickvonplaten @valhalla thanks!
This is very good news!
It’s a simple step into speech recognition.