Can Wav2vec 2.0 pertained using only 3 hours of speech?

elsayedissa · February 22, 2023, 9:21pm

Hello all,

If no, how much data is required for the pertaining process?

Thanks.

pearlyu · February 24, 2023, 3:05am

Hi, I suggest plotting a learning curve to check if it has flattened out. I found a paper experimenting on the number of instances needed to fine-tune wav2vec for a classification task. I think 3hr is not impossible depending on your data and task. So you can try and check on the performance. But a learning curve can better visualize if you’ll need to acquire more data points.

elsayedissa · February 27, 2023, 5:56pm

Thank you @pearlyu Do you a link to the paper you mentioned?

Topic		Replies	Views
Using wav2vec2 for own usecase Beginners	2	323	May 13, 2021
Wav2vec2 not converging when finetuning 🤗Transformers	7	2581	June 15, 2021
I want to custom my data set in speech recognition wav2vec Beginners	1	832	August 9, 2021
How much data were used to pre-train facebook/wav2vec2-base 🤗Transformers	1	396	January 14, 2022
Wav2Vec: Best WER for hours < 30 finetuning 🤗Transformers	0	219	January 12, 2023

Can Wav2vec 2.0 pertained using only 3 hours of speech?

Related topics