Hi! I have a problem uploading the MOCKS dataset with our prepared custom splits enabled. I've followed the guides and tutorials to do it the proper way, but each time I get the same error. I want the viewer to show the audio id, the audio, and the transcription. When I try to get the transcription from the TSV file (transcriptions are in the second column) using the proper row index, I get IndexError: list index out of range. Could you please help me figure this out? I've also tried reading the TSV files without the download_and_extract function, and the result is the same. Thanks in advance!
I opened a PR that fixes the dataset script and makes it streamable here: voiceintelligenceresearch/MOCKS · Fix dataset script
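As a side note on the IndexError itself: one common cause when reading a TSV by hand is a blank or short line, which splits into fewer columns than expected. The snippet below is only a minimal sketch of a defensive reader, not the code from the PR; the function name read_transcriptions, the text_column parameter, and the sample data are all made up for illustration.

```python
import csv
import io


def read_transcriptions(tsv_file, text_column=1):
    """Yield (row_number, transcription) pairs from a tab-separated file.

    Rows with too few columns (blank lines, truncated rows) are skipped
    instead of raising IndexError.
    """
    # QUOTE_NONE keeps quote characters inside transcriptions untouched.
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row_number, row in enumerate(reader):
        if len(row) <= text_column:
            # Indexing row[text_column] here is what would raise IndexError.
            continue
        yield row_number, row[text_column]


# Hypothetical sample: one blank line and one row missing its transcription.
sample = io.StringIO("id1\thello world\n\nid2\nid3\tgood morning\n")
rows = list(read_transcriptions(sample))
```

Printing the skipped row numbers instead of silently dropping them is a quick way to find which lines in the real file are malformed.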
Yes, great, thanks a lot! Do you know why the all and es.MCV options are not available?
Please merge this PR to fix the
The "all" config fails to stream because "-" is not allowed as a character in a split name. I'll open a PR in the datasets library to remove this limitation, but you'll have to wait until the next datasets release for the fix to be visible in the viewer.
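Until that release is out, a possible workaround is to rename the affected splits in the dataset script so they no longer contain "-". This is a rough sketch under assumptions: the helper sanitize_split_name and the example split name are hypothetical, and the pattern below (word characters in optional dot-separated groups) is my assumption about what the library accepts, so please check it against the datasets version you have installed.

```python
import re

# Assumed rule for valid split names: word characters, optionally grouped
# with dots. This is an assumption, not copied from the datasets library.
SPLIT_NAME_PATTERN = re.compile(r"^\w+(\.\w+)*$")


def sanitize_split_name(name):
    """Return a copy of name with disallowed characters replaced by '_'."""
    # Keep letters, digits, underscores and dots; replace everything else.
    cleaned = re.sub(r"[^\w.]", "_", name)
    if not SPLIT_NAME_PATTERN.match(cleaned):
        raise ValueError(f"Cannot build a valid split name from {name!r}")
    return cleaned


# Hypothetical split name containing a hyphen.
fixed = sanitize_split_name("offline-test")
```

The renamed value can then be used wherever the script declares its splits, while the original name stays in the file paths.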
ok sure, thank you again!