Why am I not getting the full file path, and thus unable to play the audio file?

vsrinivas · August 6, 2023, 12:58pm

I am working in Kaggle notebooks. I am trying to save an audio dataset to a folder in my present working directory using the .save_to_disk method and then reload it using load_from_disk. However, after reloading, the audio path shows just the file name instead of the full path. Since it is just a file name (a string), I am unable to play the audio file. How do I fix this? I would appreciate any input on what is going wrong. The code and error message are as follows:

sds

DatasetDict({
    train: Dataset({
        features: ['audio', 'sentence'],
        num_rows: 47
    })
    test: Dataset({
        features: ['audio', 'sentence'],
        num_rows: 3
    })
})

sds['train'][0]

{'audio': {'path': '/kaggle/working/traindata/f98ef22eeace.mp3',
  'array': array([ 0.00000000e+00, -8.73114914e-11,  0.00000000e+00, ...,
         -2.53720846e-05, -2.61863424e-05, -2.18840050e-05]),
  'sampling_rate': 16000},
 'sentence': 'হে, ঠিক আছে, মেয়ার আমি চাই তুমি শ্বাস নাও, ঠিক আছে?'}

sds.save_to_disk('/kaggle/working/split_datasets')
os.listdir('/kaggle/working')

['.virtual_documents', 'split_datasets', 'traindata']

split_datasets = load_from_disk('/kaggle/working/split_datasets')
split_datasets

DatasetDict({
    train: Dataset({
        features: ['audio', 'sentence'],
        num_rows: 47
    })
    validation: Dataset({
        features: ['audio', 'sentence'],
        num_rows: 3
    })
})

split_datasets['train'][0]

{'audio': {'path': 'f98ef22eeace.mp3',
  'array': array([ 0.00000000e+00, -8.73114914e-11,  0.00000000e+00, ...,
         -2.53720846e-05, -2.61863424e-05, -2.18840050e-05]),
  'sampling_rate': 16000},
 'sentence': 'হে, ঠিক আছে, মেয়ার আমি চাই তুমি শ্বাস নাও, ঠিক আছে?'}

As you can see, after reloading, the file path is just the file name: 'path': 'f98ef22eeace.mp3'.
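If the original files are still on disk, one possible workaround is to rebuild the full path by hand. The sketch below is only an illustration: `resolve_audio_path` and `AUDIO_DIR` are hypothetical names, not part of the datasets library, and the directory is assumed to be the one from the pre-save path shown earlier (`/kaggle/working/traindata`).

```python
import os


def resolve_audio_path(path, audio_dir):
    """Return a usable path for an audio file.

    An absolute path is returned unchanged. Anything else is treated as a
    bare file name and joined onto `audio_dir`.
    """
    if os.path.isabs(path):
        return path
    return os.path.join(audio_dir, os.path.basename(path))


# Assumption: the source files still live in the directory seen before saving.
AUDIO_DIR = "/kaggle/working/traindata"

print(resolve_audio_path("f98ef22eeace.mp3", AUDIO_DIR))
```

This only helps while the files in that folder still exist; it does not change what is stored in the saved dataset.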

song_sample = random.choice(split_datasets["train"])
print(song_sample)
song_path = song_sample['audio']['path']
print(song_sample['sentence'])
AudioSegment.from_mp3(song_path)

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:5                                                                                    │
│                                                                                                  │
│   2 print(song_sample)                                                                           │
│   3 song_path = song_sample['audio']['path']                                                     │
│   4 print(song_sample['sentence'])                                                               │
│ ❱ 5 AudioSegment.from_mp3(song_path)                                                             │
│   6                                                                                              │
│                                                                                                  │
│ /opt/conda/lib/python3.10/site-packages/pydub/audio_segment.py:796 in from_mp3                   │
│                                                                                                  │
│    793 │                                                                                         │
│    794 │   @classmethod                                                                          │
│    795 │   def from_mp3(cls, file, parameters=None):                                             │
│ ❱  796 │   │   return cls.from_file(file, 'mp3', parameters=parameters)                          │
│    797 │                                                                                         │
│    798 │   @classmethod                                                                          │
│    799 │   def from_flv(cls, file, parameters=None):                                             │
│                                                                                                  │
│ /opt/conda/lib/python3.10/site-packages/pydub/audio_segment.py:651 in from_file                  │
│                                                                                                  │
│    648 │   │   │   filename = fsdecode(file)                                                     │
│    649 │   │   except TypeError:                                                                 │
│    650 │   │   │   filename = None                                                               │
│ ❱  651 │   │   file, close_file = _fd_or_path_or_tempfile(file, 'rb', tempfile=False)            │
│    652 │   │                                                                                     │
│    653 │   │   if format:                                                                        │
│    654 │   │   │   format = format.lower()                                                       │
│                                                                                                  │
│ /opt/conda/lib/python3.10/site-packages/pydub/utils.py:60 in _fd_or_path_or_tempfile             │
│                                                                                                  │
│    57 │   │   close_fd = True                                                                    │
│    58 │                                                                                          │
│    59 │   if isinstance(fd, basestring):                                                         │
│ ❱  60 │   │   fd = open(fd, mode=mode)                                                           │
│    61 │   │   close_fd = True                                                                    │
│    62 │                                                                                          │
│    63 │   try:                                                                                   │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
FileNotFoundError: [Errno 2] No such file or directory: '174010aad27c.mp3'
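Note that the reloaded row still contains a decoded 'array' and a 'sampling_rate', so the audio can probably be played without touching the path at all. Below is a minimal sketch of that idea using only the standard library. The helper name `samples_to_wav_bytes` is hypothetical, and it assumes mono float samples in the range [-1.0, 1.0], which I have not confirmed for this dataset.

```python
import array
import io
import sys
import wave


def samples_to_wav_bytes(samples, sampling_rate):
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV bytes."""
    pcm = array.array("h")
    for s in samples:
        # Clip to the valid range, then scale to signed 16-bit integers.
        clipped = max(-1.0, min(1.0, float(s)))
        pcm.append(int(round(clipped * 32767)))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # assumption: mono audio
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sampling_rate)
        wav.writeframes(pcm.tobytes())
    return buffer.getvalue()


# Tiny stand-in for song_sample['audio']['array'] and ['sampling_rate'].
wav_bytes = samples_to_wav_bytes([0.0, 0.5, -0.5, 1.0], 16000)
print(len(wav_bytes))
```

In the notebook, the same call would take `song_sample['audio']['array']` and `song_sample['audio']['sampling_rate']`. The resulting bytes could then be passed to `AudioSegment.from_file(io.BytesIO(wav_bytes), format="wav")` or to `IPython.display.Audio(data=wav_bytes)`.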

I hope someone can clarify what is happening and suggest how to resolve the problem.
