I am trying out the wav2vec2 model for ASR from the huggingface library. I am passing a 7 min (~15 MB) wav file containing an English conversation to the wav2vec2 model, and I get a "can't allocate memory" error. I found that the model uses all 64 GB of the available RAM. Can anyone help with this?
- `transformers` version: 4.3.2
- Platform: Linux-3.10.0-1127.el7.x86_64-x86_64-with-glibc2.17
- Python version: 3.8.3
- PyTorch version (GPU?): 1.7.1 (False)
- Tensorflow version (GPU?): not installed (NA)
- Using GPU in script?: (NA)
- Using distributed or parallel set-up in script?: (NA)
Code
```
import os

import librosa
import nltk
import soundfile as sf
import torch
from pydub import AudioSegment
from transformers import Wav2Vec2ForCTC, Wav2Vec2Tokenizer


def convert_audio_segment(fp, upload_dir_path):
    """Convert an uploaded audio file to wav if needed."""
    USER_UPLOAD_DIR = upload_dir_path
    formats_to_convert = ['.m4a']
    dirpath = os.path.abspath(USER_UPLOAD_DIR)
    if fp.endswith(tuple(formats_to_convert)):
        (path, file_extension) = os.path.splitext(fp)
        file_extension_final = file_extension.replace('.', '')
        file_handle = ''
        try:
            track = AudioSegment.from_file(fp, file_extension_final)
            print("track", track)
            wav_path = fp.replace(file_extension_final, 'wav')
            file_handle = track.export(wav_path, format='wav')
        except Exception:
            print("ERROR CONVERTING " + str(fp))
        return file_handle
    else:
        print("No file format conversion required " + str(fp))
        return fp


def load_wav2vec_100h_model():
    tokenizer = Wav2Vec2Tokenizer.from_pretrained("facebook/wav2vec2-base-100h")
    model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-100h")
    return tokenizer, model


def correct_sentence(input_text):
    sentences = nltk.sent_tokenize(input_text)
    return ' '.join([s.replace(s[0], s[0].capitalize(), 1) for s in sentences])


def asr_transcript(tokenizer, model, input_file):
    speech, fs = sf.read(input_file)
    # Mix down to mono if the file has more than one channel
    if len(speech.shape) > 1:
        speech = speech[:, 0] + speech[:, 1]
    if fs != 16000:
        speech = librosa.resample(speech, fs, 16000)
    # The whole recording is passed to the model in a single forward call
    input_values = tokenizer(speech, return_tensors="pt").input_values
    logits = model(input_values).logits
    predicted_ids = torch.argmax(logits, dim=-1)
    transcription = tokenizer.decode(predicted_ids[0])
    return correct_sentence(transcription.lower())


if __name__ == "__main__":
    data_dir = os.getcwd()  # directory that holds the uploaded recordings
    tokenizer_100h, model_100h = load_wav2vec_100h_model()
    wav_input = 'Recording_biweu.wav'
    fp = wav_input
    processed_file = convert_audio_segment(str(fp), str(data_dir))
    text = asr_transcript(tokenizer_100h, model_100h, processed_file)
    print(text)
```
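Right now the whole 7 min recording goes into a single forward call. One workaround I am considering is to split the audio into shorter chunks and transcribe them one at a time, then join the pieces. Below is a rough sketch of what I mean; the `asr_transcript_chunked` name, the 30 s chunk length, and the plain concatenation of per-chunk transcriptions are my own assumptions, not anything from the wav2vec2 docs:

```
import soundfile as sf
import torch

def asr_transcript_chunked(tokenizer, model, input_file, chunk_s=30):
    """Transcribe a long file in fixed-size chunks (assumed 30 s) to bound memory."""
    speech, fs = sf.read(input_file)
    if len(speech.shape) > 1:
        speech = speech[:, 0] + speech[:, 1]
    # assumes the audio is already 16 kHz, like the file described below; resample first otherwise
    chunk_len = chunk_s * fs
    pieces = []
    with torch.no_grad():  # inference only, no gradients needed
        for start in range(0, len(speech), chunk_len):
            chunk = speech[start:start + chunk_len]
            input_values = tokenizer(chunk, return_tensors="pt").input_values
            logits = model(input_values).logits
            predicted_ids = torch.argmax(logits, dim=-1)
            pieces.append(tokenizer.decode(predicted_ids[0]))
    return ' '.join(pieces)
```

I realize naive chunking can cut a word at a chunk boundary, so this is only meant to show the memory-bounding idea, not a polished solution.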
More details about my wav file:
```
General
Complete name : Recording_biweu.wav
Format : Wave
File size : 13.8 MiB
Duration : 7 min 30 s
Overall bit rate mode : Constant
Overall bit rate : 256 kb/s
Track name : Recording_biweu
Recorded date : 2021
Writing application : Lavf57.83.100
Audio
Format : PCM
Format settings : Little / Signed
Codec ID : 1
Duration : 7 min 30 s
Bit rate mode : Constant
Bit rate : 256 kb/s
Channel(s) : 1 channel
Sampling rate : 16.0 kHz
Bit depth : 16 bits
Stream size : 13.8 MiB (100%)
```
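If I'm computing this right, the file corresponds to roughly 450 s × 16,000 samples/s = 7.2 million raw samples, and the wav2vec2 feature extractor downsamples the waveform by a factor of about 320, so the transformer encoder sees on the order of 22,500 frames for the whole recording:

```
duration_s = 7 * 60 + 30      # 7 min 30 s, from the file info above
sample_rate = 16_000          # 16.0 kHz, from the file info above
samples = duration_s * sample_rate   # 7,200,000 raw samples
frames = samples // 320              # ~22,500 encoder frames (wav2vec2 conv stride ~320)
print(samples, frames)               # 7200000 22500
```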
Error
```
Some weights of the model checkpoint at facebook/wav2vec2-base-100h were not used when initializing Wav2Vec2ForCTC: ['wav2vec2.mask_time_emb_vector']
- This IS expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2ForCTC from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Traceback (most recent call last):
File "asr_wav2vec2.py", line 130, in <module>
text = asr_transcript(tokenizer_100h,model_100h,processed_file)
File "asr_wav2vec2.py", line 96, in asr_transcript
logits = model(input_values).logits
File "/home/joel/pyvenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/joel/pyvenv/lib/python3.8/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py", line 795, in forward
outputs = self.wav2vec2(
File "/home/joel/pyvenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/joel/pyvenv/lib/python3.8/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py", line 646, in forward
encoder_outputs = self.encoder(
File "/home/joel/pyvenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/joel/pyvenv/lib/python3.8/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py", line 457, in forward
hidden_states, attn_weights = layer(hidden_states, output_attentions=output_attentions)
File "/home/joel/pyvenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/joel/pyvenv/lib/python3.8/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py", line 392, in forward
hidden_states, attn_weights, _ = self.attention(hidden_states, output_attentions=output_attentions)
File "/home/joel/pyvenv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/joel/pyvenv/lib/python3.8/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py", line 286, in forward
attn_weights = torch.bmm(query_states, key_states.transpose(1, 2))
RuntimeError: [enforce fail at CPUAllocator.cpp:65] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 24373495488 bytes. Error code 12 (Cannot allocate memory)
```
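The failing allocation seems consistent with the self-attention weights growing quadratically with sequence length: with wav2vec2-base's 12 attention heads and float32 (4 bytes per value), the attention matrix of a single layer for ~22,500 frames is about 12 × 4 × 22,500² ≈ 24 GB, which lines up with the 24,373,495,488 bytes in the error (that exact number corresponds to a sequence length of 22,534 frames). A quick sanity check:

```
bytes_reported = 24_373_495_488   # from the traceback above
heads = 12                        # wav2vec2-base attention heads
bytes_per_float = 4               # float32
seq_len = (bytes_reported // (heads * bytes_per_float)) ** 0.5
print(seq_len)                    # 22534.0 -> attention memory is O(seq_len**2)
```

If that reading is right, it is also why I am considering the chunked approach sketched above: a 30 s chunk is only ~1,500 frames, so the per-layer attention matrix would stay around 100 MB instead of 24 GB.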