I’m also really happy to say that the team here at Weights & Biases would like to support the community’s training efforts as much as we can!
(If you haven’t used Weights & Biases before you can find brand new docs for our Hugging Face integration here or check out the XLSR colab linked below)
- Weights & Biases will create public language-specific W&B Projects so that multiple people can collaborate effectively on the same language, add your language here
- We have a beta dataset visualisation feature that I think is super useful for exploring speech datasets
- We have a W&B XLSR Colab to show how best to instrument your code to log your model’s training progress as well as upload and version your datasets, tokenizers, processors and models (before logging your best model to the HF Model Hub).
Language-specific W&B Projects - just ask
To help organise multiple people working on the same language, we are happy to create public language-specific W&B Projects to which anyone can log their results, datasets, models, tokenizers, etc. This way folks working on the same language can work as a team, easily share results, and see the configs and hyperparameters that were used for specific model runs.
W&B Dataset Visualisation [Beta]
I’m super excited about using this feature to quickly explore speech datasets.
With this new W&B feature (still in Beta) you can easily explore rich media tables to better understand your speech dataset. I’ve made a quick video demo which I think will best explain the value of this feature for EDA of rich media such as audio and video.
To see the code that created this rich media table in W&B Artifacts, see the W&B XLSR Colab
This is still in beta and the team would love to hear any feedback you have on it. Please feel free to ping me about it and I can pass it on to the team. Docs are here for more info.
W&B XLSR Colab
We have also created a W&B XLSR Colab with setup and training from top to bottom to show how you can get the most out of Weights & Biases. Get wandb set up to log your model’s training as well as version your datasets, tokenizers, processors and models!
To make finding the wandb code a bit easier, the relevant headings in the notebook start with “WANDB: …”. Just search “wandb” to jump through the notebook and find the code you’re looking for!
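For quick reference, the simplest way to point the Hugging Face `Trainer` at wandb outside the Colab is via environment variables; the project name below is a placeholder for your language’s shared project:

```python
import os

# Placeholder project name -- use the shared project created for your language.
os.environ["WANDB_PROJECT"] = "xlsr-turkish"

# Ask the HF integration to also upload the trained model as a W&B Artifact.
# (Accepted values have varied across transformers versions; "true" works on the
# versions current at the time of this challenge.)
os.environ["WANDB_LOG_MODEL"] = "true"

# Then pass report_to="wandb" in your TrainingArguments so the Trainer
# logs metrics to W&B instead of (or alongside) other loggers.
```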
Let us know how it goes!
Let us know how integrating and using W&B goes and if you run into any issues! I’ll be active here and in the Hugging Face XLSR slack channel to answer any Weights & Biases questions you might have! @boris will also be able to help out too!
Best of luck with the challenge everyone!!