Dataset providing additional data for ASR pipelines

carlosjhc · May 4, 2022, 7:42am

Hi,

I create a pipeline for ASR with:

asr = get_offline_pipeline('automatic-speech-recognition', local_dir)

And then use it for ASR with:

for inference in asr(inference_dataset, batch_size=100) …

For some of the inference data I have the correct transcriptions, and I’d like to pass them through alongside the audio so that I can calculate WER on the results. If I change inference_dataset to return a tuple (audio_data, transcription), the pipeline fails because it does not expect a tuple. Is there any way to do this?

Thanks,
Carlos
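
One possible workaround, sketched here under assumptions rather than taken from the pipeline's documentation: leave inference_dataset returning audio only, keep the transcriptions in a separate list in the same order, and pair them with the outputs by position afterwards. This assumes the pipeline yields results in dataset order and that each result is a dict with a "text" key. The names word_error_rate and score_outputs are hypothetical helpers, and fake_outputs is made-up data standing in for the real pipeline output.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference words
    # consumed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / max(len(ref), 1)


def score_outputs(
    outputs: Iterable[Dict[str, str]],
    references: Iterable[Optional[str]],
) -> Iterator[Tuple[str, Optional[float]]]:
    """Pair outputs with references by position (hypothetical helper).

    A reference of None means no transcription is available for that
    item, so its WER is reported as None.
    """
    for out, ref in zip(outputs, references):
        text = out["text"]
        yield text, (None if ref is None else word_error_rate(ref, text))


# Made-up stand-in for what the pipeline would yield, one dict per audio item.
fake_outputs = [{"text": "the cat sat"}, {"text": "anything"}, {"text": "hello word"}]
# Transcriptions in the same order as the dataset; None where unknown.
references = ["the cat sat", None, "hello world"]

results = list(score_outputs(fake_outputs, references))
```

With the real pipeline, fake_outputs would be replaced by asr(inference_dataset, batch_size=100), and references would be built in the same order as the items in inference_dataset. If the order cannot be relied on, this approach does not apply as written.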
