Hello,
I’ve come across many tutorial articles, but this is one of the most well-organized and insightful courses I’ve encountered so far. Amazing work, and I really appreciate the time you spent putting these goodies together.
While working through the course, I ran into a situation, likely a problem with my local file structure, where I wasn’t able to run:
new_column = [librosa.get_duration(path=x) for x in minds["path"]]
because librosa can’t find the file at the path provided by minds, outputting the following error message:
/var/folders/4b/xjs02jw50cn561qgx_9d2z200000gn/T/ipykernel_78132/3624879537.py:2: FutureWarning: PySoundFile failed. Trying audioread instead.
Audioread support is deprecated in librosa 0.10.0 and will be removed in version 1.0.
new_column = [librosa.get_duration(path=x) for x in minds["path"]]
---------------------------------------------------------------------------
LibsndfileError Traceback (most recent call last)
File /opt/homebrew/lib/python3.11/site-packages/librosa/core/audio.py:795, in get_duration(y, sr, S, n_fft, hop_length, center, path, filename)
794 try:
--> 795 return sf.info(path).duration # type: ignore
796 except sf.SoundFileRuntimeError:
File /opt/homebrew/lib/python3.11/site-packages/soundfile.py:467, in info(file, verbose)
460 """Returns an object with information about a `SoundFile`.
461
462 Parameters
(...)
465 Whether to print additional information.
466 """
--> 467 return _SoundFileInfo(file, verbose)
File /opt/homebrew/lib/python3.11/site-packages/soundfile.py:412, in _SoundFileInfo.__init__(self, file, verbose)
411 self.verbose = verbose
--> 412 with SoundFile(file) as f:
413 self.name = f.name
File /opt/homebrew/lib/python3.11/site-packages/soundfile.py:658, in SoundFile.__init__(self, file, mode, samplerate, channels, subtype, endian, format, closefd)
656 self._info = _create_info_struct(file, mode, samplerate, channels,
657 format, subtype, endian)
--> 658 self._file = self._open(file, mode_int, closefd)
659 if set(mode).issuperset('r+') and self.seekable():
660 # Move write position to 0 (like in Python file objects)
File /opt/homebrew/lib/python3.11/site-packages/soundfile.py:1216, in SoundFile._open(self, file, mode_int, closefd)
1215 err = _snd.sf_error(file_ptr)
-> 1216 raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
1217 if mode_int == _snd.SFM_WRITE:
1218 # Due to a bug in libsndfile version <= 1.0.25, frames != 0
1219 # when opening a named pipe in SFM_WRITE mode.
1220 # See http://github.com/erikd/libsndfile/issues/77.
LibsndfileError: Error opening '/storage/hf-datasets-cache/all/datasets/27907695716030-config-parquet-and-info-PolyAI-minds14-941a5af2/downloads/extracted/a87e442545495cdb67dfdcbc9d4f35d234c9f8e471449b2db58d7c81b62f001a/en-AU~PAY_BILL/response_4.wav': System error.
During handling of the above exception, another exception occurred:
FileNotFoundError Traceback (most recent call last)
Cell In[14], line 2
1 # use librosa to get example's duration from the audio file
----> 2 new_column = [librosa.get_duration(path=x) for x in minds["path"]]
3 minds = minds.add_column("duration", new_column)
5 # # use 🤗 Datasets' `filter` method to apply the filtering function
6 # minds = minds.filter(is_audio_length_in_range, input_columns=["duration"])
7
8 # # remove the temporary helper column
9 # minds = minds.remove_columns(["duration"])
10 # minds
Cell In[14], line 2, in <listcomp>(.0)
1 # use librosa to get example's duration from the audio file
----> 2 new_column = [librosa.get_duration(path=x) for x in minds["path"]]
3 minds = minds.add_column("duration", new_column)
5 # # use 🤗 Datasets' `filter` method to apply the filtering function
6 # minds = minds.filter(is_audio_length_in_range, input_columns=["duration"])
7
8 # # remove the temporary helper column
9 # minds = minds.remove_columns(["duration"])
10 # minds
File /opt/homebrew/lib/python3.11/site-packages/librosa/core/audio.py:804, in get_duration(y, sr, S, n_fft, hop_length, center, path, filename)
796 except sf.SoundFileRuntimeError:
797 warnings.warn(
798 "PySoundFile failed. Trying audioread instead."
799 "\n\tAudioread support is deprecated in librosa 0.10.0"
(...)
802 category=FutureWarning,
803 )
--> 804 with audioread.audio_open(path) as fdesc:
805 return fdesc.duration # type: ignore
807 if y is None:
File /opt/homebrew/lib/python3.11/site-packages/audioread/__init__.py:127, in audio_open(path, backends)
125 for BackendClass in backends:
126 try:
--> 127 return BackendClass(path)
128 except DecodeError:
129 pass
File /opt/homebrew/lib/python3.11/site-packages/audioread/rawread.py:59, in RawAudioFile.__init__(self, filename)
58 def __init__(self, filename):
---> 59 self._fh = open(filename, 'rb')
61 try:
62 self._file = aifc.open(self._fh)
FileNotFoundError: [Errno 2] No such file or directory: '/storage/hf-datasets-cache/all/datasets/27907695716030-config-parquet-and-info-PolyAI-minds14-941a5af2/downloads/extracted/a87e442545495cdb67dfdcbc9d4f35d234c9f8e471449b2db58d7c81b62f001a/en-AU~PAY_BILL/response_4.wav'
To debug the issue, I printed minds[0]['path'] and got:
/storage/hf-datasets-cache/all/datasets/27907695716030-config-parquet-and-info-PolyAI-minds14-941a5af2/downloads/extracted/a87e442545495cdb67dfdcbc9d4f35d234c9f8e471449b2db58d7c81b62f001a/en-AU~PAY_BILL/response_4.wav
That looks like an ordinary path, and it is exactly the path librosa is trying to open in the traceback above.
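A check along these lines would show whether the paths stored in the dataset exist on my machine at all. This is only a sketch: find_missing_paths is a helper name made up for illustration, and it assumes that minds["path"] yields plain string paths.

```python
import os


def find_missing_paths(paths):
    """Return the subset of `paths` that does not exist on this machine."""
    return [p for p in paths if not os.path.exists(p)]


# Hypothetical usage with the dataset from this post (not run here):
# missing = find_missing_paths(minds["path"])
# print(f"{len(missing)} of {len(minds)} paths are missing locally")
```

If every path is reported missing, the "path" column is probably pointing at a location that was never created on my machine, rather than at files in my local cache.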
What could be the issue?
Thank you for your help