Why does pipeline inference on CPU with PyTorch for wav2vec2 use only 50% of the CPU? And does chunk length affect the model's speed?

Hi, I am following "Making automatic speech recognition work on large files with Wav2Vec2 in 🤗 Transformers" (huggingface.co) and trying to use the pipeline to transcribe a long audio file. My code is similar to this simple three-liner:

from transformers import pipeline
pipe = pipeline(model="facebook/wav2vec2-base-960h")
output = pipe("very_long_file.wav", chunk_length_s=10, stride_length_s=(4, 2))

My setup is CPU-only. I am trying to transcribe a 4-hour WAV file, but once it started running, the Python process was capped at 400% CPU (in the Mac's Activity Monitor), even though I have 8 CPUs available in total, so the maximum should be 800%. How can I get it to use all of the CPUs?
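One thing that can be checked here is PyTorch's intra-op thread count. The sketch below assumes (this is not confirmed as the cause) that the 400% cap comes from PyTorch defaulting to fewer threads than there are logical CPUs. It only uses `os.cpu_count()`, `torch.get_num_threads()`, and `torch.set_num_threads()`.

```python
import os

import torch

# Number of logical CPUs the OS reports (may be None on unusual platforms).
logical_cpus = os.cpu_count() or 1
print("logical CPUs:", logical_cpus)

# Threads PyTorch currently uses for intra-op parallelism on CPU.
print("intra-op threads before:", torch.get_num_threads())

# Assumption: asking for one thread per logical CPU lets PyTorch use more cores.
# Whether this speeds things up depends on the machine and the workload.
torch.set_num_threads(logical_cpus)
print("intra-op threads after:", torch.get_num_threads())
```

If this is tried, the `torch.set_num_threads` call would go before the `pipeline(...)` call. More threads is not guaranteed to be faster, so it is worth timing a short file both ways.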

My second question: I understand that chunk length affects memory usage, but does it also have a major impact on processing speed? Will a longer chunk length make it run faster? I understand that stride length is the overlap between neighboring chunks, so a long stride might make the code take longer, but does chunk length itself have a major impact?
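For reference, the arithmetic behind the stride part of this question can be sketched in plain Python. It assumes the pipeline advances by `chunk_length_s` minus both stride values between consecutive chunks, which is my reading of the chunking described in the blog post. `count_chunks` is a hypothetical helper, not part of Transformers, and it counts chunks only; it does not measure actual model speed.

```python
def count_chunks(audio_s: int, chunk_s: int, stride_left_s: int, stride_right_s: int) -> int:
    """Count the chunks needed to cover audio_s seconds of audio.

    Assumption: each new chunk starts (chunk_s - stride_left_s - stride_right_s)
    seconds after the previous one, so the strides are re-processed overlap.
    """
    step = chunk_s - stride_left_s - stride_right_s
    if step <= 0:
        raise ValueError("chunk length must exceed the sum of both strides")
    n_chunks = 0
    start = 0
    while start < audio_s:
        n_chunks += 1
        if start + chunk_s >= audio_s:  # this chunk reaches the end of the audio
            break
        start += step
    return n_chunks


four_hours_s = 4 * 60 * 60  # 14400 seconds

short_chunks = count_chunks(four_hours_s, 10, 4, 2)  # settings from the snippet above
long_chunks = count_chunks(four_hours_s, 30, 4, 2)   # same strides, longer chunks

print(short_chunks)  # → 3599
print(long_chunks)   # → 600
```

Under that assumption, 10-second chunks with strides of (4, 2) feed roughly 2.5 times the audio duration through the model, while 30-second chunks with the same strides feed about 1.25 times. That only accounts for the redundant overlap; how per-chunk compute time scales with chunk length is a separate question.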