Failed to load audio from io.BytesIO object

Hello everyone,

I want to do ASR with wav2vec 2.0 and the Common Voice German dataset.

After loading the data, I want to prepare the Wav2Vec2CTCTokenizer and feature extractor.

When I run:
common_voice_train[0]["path"]

I get the following RuntimeError:

RuntimeError                              Traceback (most recent call last)
----> 1 common_voice_train[0]["path"]

12 frames

/usr/local/lib/python3.8/dist-packages/torchaudio/backend/sox_io_backend.py in _fail_load_fileobj(fileobj, *args, **kwargs)
     31
     32 def _fail_load_fileobj(fileobj, *args, **kwargs):
---> 33     raise RuntimeError(f"Failed to load audio from {fileobj}")
     34
     35

RuntimeError: Failed to load audio from <_io.BytesIO object at 0x7effc8da2900>

I am working in a Google Colab notebook.
These are my versions:
GPU: A100-SXM4-40GB
CUDA version: 11.2
datasets==2.1.0
librosa==0.8.1
tensorflow==2.9.2
tokenizers==0.13.2
torch==1.13.0+cu116
torchaudio==0.13.0+cu116
transformers==4.24.0
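
For anyone trying to reproduce this setup, here is a minimal sketch that prints the installed versions of the packages listed above, using only the standard library (importlib.metadata, available from Python 3.8). The names PACKAGES and installed_versions are illustrative and not part of any library. The snippet only reports what is installed; it does not diagnose the audio loading error.

```python
from importlib import metadata

# Package names taken from the version list above (illustrative constant).
PACKAGES = [
    "datasets",
    "librosa",
    "tensorflow",
    "tokenizers",
    "torch",
    "torchaudio",
    "transformers",
]


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to the string "not installed".
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Print in the same "name==version" form used in the list above.
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}=={version}")
```

Running this in a fresh notebook cell and pasting the output into the post makes it easier to compare environments.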

Is there a problem with my package versions or with the Common Voice data?

Thank you very much in advance for helping me!

Hi, were you able to find a solution for this?
