Hello everyone,
I want to do ASR with wav2vec 2.0 and the Common Voice German dataset.
After loading the data, I want to prepare the Wav2Vec2CTCTokenizer and feature extractor.
When I run:
common_voice_train[0]["path"]
I get the following RuntimeError:
RuntimeError                              Traceback (most recent call last)
----> 1 common_voice_train[0]["path"]

12 frames

/usr/local/lib/python3.8/dist-packages/torchaudio/backend/sox_io_backend.py in _fail_load_fileobj(fileobj, *args, **kwargs)
     31
     32 def _fail_load_fileobj(fileobj, *args, **kwargs):
---> 33     raise RuntimeError(f"Failed to load audio from {fileobj}")
     34
     35

RuntimeError: Failed to load audio from <_io.BytesIO object at 0x7effc8da2900>
I am working in a Google Colab notebook.
These are my versions:
GPU: A100-SXM4-40GB
CUDA version: 11.2
datasets==2.1.0
librosa==0.8.1
tensorflow==2.9.2
tokenizers==0.13.2
torch==1.13.0+cu116
torchaudio==0.13.0+cu116
transformers==4.24.0
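In case it helps with reproducing this, here is a minimal sketch of how a version list like the one above could be collected inside the notebook. It uses only the standard library (`importlib.metadata`). The helper name `installed_versions` and the `PACKAGES` list are illustrative and not part of any of the libraries mentioned.

```python
from importlib import metadata

# Package names taken from the list above; adjust as needed.
PACKAGES = ["datasets", "librosa", "tensorflow", "tokenizers",
            "torch", "torchaudio", "transformers"]


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None instead of raising.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print in the same "name==version" style used in this post.
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

Running this in the same Colab runtime that produced the error should show the versions that are actually loaded there, which may differ from what was installed in an earlier session.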
Is there any problem with my versions and/or the Common Voice data?
Thank you very much in advance for helping me!