Hi, I want to evaluate my model on the LibriSpeech dataset:
| | Train.500 | Train.360 | Train.100 | Valid | Test |
|---|---|---|---|---|---|
| clean | - | 104014 | 28539 | 2703 | 2620 |
| other | 148688 | - | - | 2864 | 2939 |
I don’t need the `train.500`, `train.360`, `train.100`, and `valid` sets. Is there any way to load only the `test` set from `librispeech_asr`?
I can load all the data and keep only the `test` set afterwards, but loading everything takes too long :(( So I want to load only the `test` set.
from datasets import load_dataset, DatasetDict
# It takes forever to load everything here...
libre_dataset = load_dataset("librispeech_asr", 'clean')
keep = ["test"]
libre_dataset = DatasetDict({k: dataset for k, dataset in libre_dataset.items() if k in keep})
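One direction that might avoid this, as a minimal sketch rather than a confirmed fix: `load_dataset` accepts a `split` argument and a `streaming` flag, so the call could ask for the `test` split alone and read it lazily. The names `LOAD_KWARGS` and `load_test_split` are hypothetical helpers made up for this example. How much download and preparation work `split="test"` saves without streaming is an assumption here and may depend on the installed `datasets` version and on how the dataset is hosted.

```python
# Sketch only: request a single split instead of building the whole DatasetDict.
# LOAD_KWARGS and load_test_split are hypothetical names used for illustration.
LOAD_KWARGS = {
    "path": "librispeech_asr",  # same dataset name as in the snippet above
    "name": "clean",            # same config as in the snippet above
    "split": "test",            # ask for this split only
    "streaming": True,          # read examples lazily instead of preparing everything first
}


def load_test_split():
    """Return only the test split of the 'clean' config."""
    # Imported inside the function so the settings above can be inspected
    # even on a machine where `datasets` is not installed.
    from datasets import load_dataset

    # With streaming=True this returns an iterable dataset rather than a
    # DatasetDict, so there are no other splits to filter out afterwards.
    return load_dataset(**LOAD_KWARGS)
```

Calling `libre_test = load_test_split()` would then give a single split to iterate over. With `streaming=True` the result is an iterable dataset, so random access such as `libre_test[0]` is not available; setting `"streaming": False` keeps `split="test"` but may still take a long time if all archives are fetched first.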
Thanks for reading!