Hi guys, I started this topic about fine-tuning Wav2Vec2 for the Portuguese language. Let's exchange knowledge and help implement this model at Hugging Face.
I’m training on Google Cloud with 2 T4 GPUs, which are quite inexpensive, and I use
screen to keep the code running even if SSH disconnects for any reason. Steps to run screen properly:
$ script /dev/null
$ screen
$ python training.py
To check what part of the code is running, SSH in again and run:
$ screen -r            # reattach when there is a single session
$ screen -d -r 3310    # detach session 3310 elsewhere and reattach to it here
To quit screen:
$ screen -X -S 3310.pts-6.training quit
Do you think it will be possible to do it on Colab?
The dataset looks huge and becomes larger after all the caching of the processed data.
Hi Gunjan, I tried to train on Colab but I had a “memory full” error. As seen in our Slack, the model can be trained on OVH Cloud: https://www.ovh.ie/
- fill in this form: OVH - Wav2Vec2 - Fine Tuning week - GPU accounts - Google Sheets
- they will send you a voucher code
- more info in the Discord channel: https://discord.gg/HaNEhBax
You can also try with 24 GB RAM and an NVIDIA P100 on Google Colab: make a copy of this notebook and use it: https://colab.research.google.com/drive/1D6krVG0PPJR2Je9g5eN_2h6JP73_NUXz
Current state of effort:
My main model’s WER got stuck at 0.3656: 3e-4 turned out to be too high a learning rate. This happened at checkpoint 2600, so I decided to use the saved weights to run two models in parallel.
First model: take the original model and decrease the learning rate to 0.8e-4. Result so far at checkpoint 400:
Second model: comment out model.freeze_feature_extractor(). Result so far at checkpoint 400 (this will take much longer to train):
Also, I had GPU issues, so the training is not optimized.
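For context, the two runs above differ by only a couple of lines. A sketch assuming the standard XLSR fine-tuning script (the output dir and the omitted arguments are illustrative, not from this thread):

```python
from transformers import Wav2Vec2ForCTC, TrainingArguments

model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-large-xlsr-53")

# First model: keep the CNN feature extractor frozen, lower the learning rate
model.freeze_feature_extractor()
training_args = TrainingArguments(
    output_dir="./wav2vec2-xlsr-pt",   # illustrative path
    learning_rate=0.8e-4,              # down from 3e-4, which plateaued at WER 0.3656
    # ... remaining arguments unchanged ...
)

# Second model: skip (comment out) the freeze call, so the feature
# extractor is trained too -- far more compute per step:
# model.freeze_feature_extractor()
```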
Model 2 - model.freeze_feature_extractor() commented out: aborted due to GPU config issues plus the high compute cost.
Model 1 - learning rate decreased to 0.8e-4: WER = 0.33
Dataset = Common Voice = 1.7 GB
Nice to know that you found a good set of hyperparameters.
I’m not a native speaker of Portuguese, did you change the vocab dict in any manner?
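On the vocab question: in the standard fine-tuning notebook the vocab dict is built directly from the dataset’s transcripts, so Portuguese accented characters get included automatically without manual edits. A minimal sketch of that step (the sample sentences are invented):

```python
def build_vocab(texts):
    # collect every character that appears in the (cleaned) transcripts
    chars = sorted(set("".join(texts)))
    vocab = {c: i for i, c in enumerate(chars)}
    # Wav2Vec2's CTC tokenizer uses "|" as the word delimiter, not a space
    vocab["|"] = vocab.pop(" ")
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab

# accented Portuguese characters show up with no manual changes needed
vocab = build_vocab(["olá tudo bem", "coração e ação"])
```

The resulting dict is what gets saved as vocab.json and handed to Wav2Vec2CTCTokenizer, so as long as the training text is Portuguese, no manual vocab change should be needed.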
Hi everyone, how’s it going?
I’m not sure how to run prediction.
I’m basing my work on a fine-tuned XLSR-Wav2Vec2 model
and I don’t think it’s optimal to delete the text labels if I’m going to use them later for comparison.
But it’s extremely exhausting because they don’t explain the input format for the compute_metrics function, they only say it’s pred… so it’s a black box inside another black box…
I don’t know how to do the forward pass and I haven’t had significant contact with PyTorch, so when I try to run prediction in batches, that’s it…
it just throws errors…
I’ll end up having to write a function that repeats what another function does because I don’t know what preds is…
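For what it’s worth, here is a minimal sketch of what the Trainer hands to compute_metrics, assuming the standard XLSR-Wav2Vec2 notebook setup. The toy vocab and the pure-Python helpers below are illustrative, not from the thread; `pred` is an EvalPrediction whose `predictions` are the model’s logits and whose `label_ids` are the padded labels:

```python
import numpy as np
from collections import namedtuple

# `pred` has two numpy fields:
#   pred.predictions -> logits, shape (batch, frames, vocab_size)
#   pred.label_ids   -> padded label ids, with -100 marking padding
EvalPrediction = namedtuple("EvalPrediction", ["predictions", "label_ids"])

# toy character vocab; id 0 is the CTC blank/pad, "|" is the word delimiter
ID2CHAR = {0: "<pad>", 1: "o", 2: "i", 3: "|"}

def ctc_decode(ids, blank_id=0):
    # greedy CTC decoding: collapse repeated ids, then drop blanks
    out, prev = [], None
    for i in ids:
        if i != prev and i != blank_id:
            out.append(ID2CHAR[i])
        prev = i
    return "".join(out).replace("|", " ").strip()

def decode_labels(ids, pad_id=0):
    # labels are real token ids (no repeat-collapsing), just drop padding
    return "".join(ID2CHAR[i] for i in ids if i != pad_id).replace("|", " ").strip()

def word_edit_distance(ref, hyp):
    # word-level Levenshtein distance via the classic DP recurrence
    a, b = ref.split(), hyp.split()
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (x != y))
    return dp[-1]

def compute_metrics(pred):
    pred_ids = np.argmax(pred.predictions, axis=-1)          # (batch, frames)
    labels = np.where(pred.label_ids == -100, 0, pred.label_ids)
    pred_str = [ctc_decode(seq) for seq in pred_ids]
    label_str = [decode_labels(seq) for seq in labels]
    errors = sum(word_edit_distance(r, h) for r, h in zip(label_str, pred_str))
    words = sum(len(r.split()) for r in label_str)
    return {"wer": errors / max(words, 1)}

# tiny worked example: frames decode to "oi", labels are "oi" -> WER 0.0
logits = np.eye(4)[[1, 1, 0, 2, 3]][None]          # one utterance, 5 frames
labels = np.array([[1, 2, -100, -100, -100]])
result = compute_metrics(EvalPrediction(logits, labels))
```

In the actual notebook the decoding is done with processor.batch_decode and the score comes from the datasets WER metric; the pure-Python helpers here only make the shapes and the flow concrete.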