Why does pipeline inference with CPU and PyTorch for wav2vec2 only use 50% of the CPU? And does chunk length impact the model's speed?

Hi, I am following Making automatic speech recognition work on large files with Wav2Vec2 in :hugs: Transformers (huggingface.co) and am trying to use the pipeline to transcribe a long audio file. It comes down to a simple three-liner:

from transformers import pipeline

# chunk_length_s splits the audio into 10 s windows; stride_length_s=(left, right)
# is the overlap kept on each side of a chunk so words are not cut at boundaries
pipe = pipeline(model="facebook/wav2vec2-base-960h")
output = pipe("very_long_file.wav", chunk_length_s=10, stride_length_s=(4, 2))

My setup is CPU-only. I am trying to transcribe a 4-hour WAV file, but once it started running, the Python process capped at 400% CPU (in Mac's Activity Monitor), while I have 8 CPUs in total, so the maximum should be 800%. How can I get it to use all of the CPUs?
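One thing I could try is raising PyTorch's thread counts, since they sometimes default to fewer threads than the machine has cores. This is only a guess at the cause; a minimal sketch, assuming the intra-op thread pool is the bottleneck (the value 8 is an assumption matching my core count):

import torch

# These must be called before any inference runs; 8 is an assumption
# matching the 8 CPUs mentioned above, not a recommended value.
torch.set_num_threads(8)          # threads used inside a single op (e.g. matmul)
torch.set_num_interop_threads(8)  # threads used across independent ops

print(torch.get_num_threads(), torch.get_num_interop_threads())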

The second question: I understand that chunk length affects memory usage, but does it also have a major impact on processing speed? Will a longer chunk length make the pipeline run faster? I understand that the stride length is the overlap between chunks, so a longer stride might make the run take longer, but is there a major impact from chunk length itself?
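Rather than guessing, it should be possible to measure this with a small timing loop over a shorter clip. A sketch ("sample.wav" is a hypothetical test file, not the 4-hour recording):

import time
from transformers import pipeline

pipe = pipeline(model="facebook/wav2vec2-base-960h")

# Time the same clip at several chunk lengths to see whether
# chunk_length_s changes throughput on this machine.
for chunk_s in (10, 20, 30):
    start = time.perf_counter()
    pipe("sample.wav", chunk_length_s=chunk_s, stride_length_s=(4, 2))
    print(f"chunk_length_s={chunk_s}: {time.perf_counter() - start:.1f}s")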

Hi @Hicu, did you get an answer to this?