Hi, the documentation only explains how to add audio files, but I want to add audio files together with their transcriptions.
How can I do that, so I can build a dataset of snippet/transcription pairs that I can train on?
Also, if I want to have 2 separate datasets, one for test and one for training, what’s the approach to follow? Upload everything and tag the split in metadata.csv, or create 2 folders and upload the snippets/transcriptions into each?
Hi! Here is an example in Python:
from datasets import Dataset, Audio

ds = Dataset.from_dict({
    "audio": ["path/to/audio_1", "path/to/audio_2", ..., "path/to/audio_n"],
    "transcription": ["First transcript", "Second transcript", ..., "Last transcript"],
}).cast_column("audio", Audio())
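If you don’t have audio files at hand yet, here is a minimal, self-contained sketch of the dict shape Dataset.from_dict expects. The file names and transcripts are made up for illustration; the snippet writes two short silent WAV files with the standard-library wave module so it runs anywhere:

```python
import wave

# Write two short silent WAV snippets (16 kHz, mono, 16-bit) as stand-ins
# for real recordings; the file names here are purely illustrative.
paths = ["audio_1.wav", "audio_2.wav"]
for path in paths:
    with wave.open(path, "wb") as f:
        f.setnchannels(1)                    # mono
        f.setsampwidth(2)                    # 16-bit samples
        f.setframerate(16000)                # 16 kHz
        f.writeframes(b"\x00\x00" * 16000)   # one second of silence

# Parallel lists, one audio path and one transcript per example -- this is
# the structure passed to Dataset.from_dict above.
data = {
    "audio": paths,
    "transcription": ["First transcript", "Second transcript"],
}

# With the datasets library installed, you would then build the dataset with:
# from datasets import Dataset, Audio
# ds = Dataset.from_dict(data).cast_column("audio", Audio())
```

Casting the "audio" column to Audio() is what makes datasets decode the files into arrays on access instead of treating them as plain path strings.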
Alternatively, you can define an AudioFolder (see the docs):
my_dataset/
├── README.md
├── metadata.csv
└── data/
├── audio_0.wav
...
└── audio_n.wav
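In an AudioFolder, metadata.csv is what links each file to its transcription: it needs a file_name column with paths relative to the metadata file, plus your extra columns. As a sketch, assuming the layout above (the rows are illustrative placeholders), you could generate it with the standard csv module:

```python
import csv
import os

# Recreate the layout from the tree above.
os.makedirs("my_dataset/data", exist_ok=True)

# One row per audio file: file_name (relative to metadata.csv) plus the
# transcription column. These rows are placeholders for illustration.
rows = [
    ("data/audio_0.wav", "First transcript"),
    ("data/audio_1.wav", "Second transcript"),
]

with open("my_dataset/metadata.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["file_name", "transcription"])  # required header
    writer.writerows(rows)

# With the datasets library installed, loading is then:
# from datasets import load_dataset
# ds = load_dataset("audiofolder", data_dir="my_dataset")
```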
Also, if I want to have 2 separate datasets, one for test and one for training, what’s the approach to follow? Upload everything and tag the split in metadata.csv, or create 2 folders and upload the snippets/transcriptions into each?
You can structure your AudioFolder like this:
my_dataset/
├── README.md
├── metadata.csv
├── test/
| ├── audio_0.wav
| ...
| └── audio_n.wav
└── train/
├── audio_0.wav
...
└── audio_n.wav
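With a single root metadata.csv, the file_name paths include the split directory, which is how each row is mapped to the right split. A small sketch of laying that out (paths and transcripts are illustrative):

```python
import csv
import os

# Create the train/ and test/ split directories shown in the tree above.
for split in ("train", "test"):
    os.makedirs(os.path.join("my_dataset", split), exist_ok=True)

# One shared metadata.csv at the root: file_name paths are prefixed with the
# split directory so each row lands in the matching split.
rows = [
    ("train/audio_0.wav", "A training transcript"),
    ("test/audio_0.wav", "A test transcript"),
]

with open("my_dataset/metadata.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["file_name", "transcription"])
    writer.writerows(rows)
```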
It’s also possible to have one metadata.csv in train/ and one in test/ if you prefer; in that case the file_name paths are relative to each split directory.