How to clip audio files in an audio dataset?


I’m trying to use the common_voice dataset, but I want to keep the audio files to a maximum of 5 seconds. How can I achieve that?

Hi! You can use filter to keep only the files that are shorter than 5 seconds (longer clips are dropped, not trimmed):

from datasets import load_dataset

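def is_short(example, max_length_in_seconds=5):
    # Decoded audio: a 1-D array of samples plus its sampling rate in Hz
    arr = example["audio"]["array"]
    sampling_rate = example["audio"]["sampling_rate"]
    # Duration = number of samples / samples per second
    length_in_seconds = arr.shape[0] / sampling_rate
    return length_in_seconds < max_length_in_seconds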

ds = load_dataset("common_voice", "ab", split="train")
ds = ds.filter(is_short)
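
If you would rather trim long clips to 5 seconds than drop them, here is a minimal sketch of that alternative. The helper name clip_audio is my own and not part of datasets. It assumes each example has the same "audio" dict with "array" and "sampling_rate" keys as above, and it is shown on a synthetic example so it runs without downloading anything:

```python
import numpy as np

def clip_audio(example, max_length_in_seconds=5):
    # Hypothetical helper: keep at most the first N seconds of the waveform.
    audio = example["audio"]
    max_samples = int(max_length_in_seconds * audio["sampling_rate"])
    clipped = np.asarray(audio["array"])[:max_samples]
    # Return a new example so the input dict is left untouched.
    return {**example, "audio": {**audio, "array": clipped}}

# Synthetic 8-second clip at 16 kHz (silence), just to illustrate the behaviour
example = {"audio": {"array": np.zeros(8 * 16000), "sampling_rate": 16000}}
clipped = clip_audio(example)
```

On a real dataset you would presumably apply it with ds = ds.map(clip_audio). How the returned audio gets re-encoded and stored can depend on your datasets version, so check a few examples after mapping.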