Reduced WavLMForXVector performance on LibriSpeech


I’ve been benchmarking WavLMForXVector on LibriSpeech data and I get EER = 4.7%, while the WavLM paper (Table II) reports EER = 0.84% for WavLM Base+.
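In case it helps to rule out a metric bug: here is a minimal sketch of how I understand EER to be computed, as the operating point where the miss rate equals the false-alarm rate over a threshold sweep (the function name and the toy scores are my own, not from the paper or the docs):

```python
import numpy as np

def compute_eer(target_scores, nontarget_scores):
    """Equal error rate: point where miss rate ~= false-alarm rate."""
    scores = np.concatenate([target_scores, nontarget_scores])
    labels = np.concatenate([np.ones(len(target_scores)),
                             np.zeros(len(nontarget_scores))])
    # Sweep the decision threshold by sorting scores in descending order
    order = np.argsort(-scores)
    labels = labels[order]
    # After accepting the top i scores:
    fnr = 1.0 - np.cumsum(labels) / labels.sum()          # miss rate
    fpr = np.cumsum(1.0 - labels) / (1.0 - labels).sum()  # false-alarm rate
    # EER is where the two curves cross
    idx = np.argmin(np.abs(fnr - fpr))
    return 0.5 * (fnr[idx] + fpr[idx])

# Toy sanity check: perfectly separated scores should give EER = 0
print(compute_eer(np.array([0.9, 0.85, 0.8]), np.array([0.2, 0.1])))  # 0.0
```

I get the same numbers from this as from sweeping thresholds by hand, so I don’t think the metric itself is my problem.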

I used the example code from the docs (WavLM), but loaded the data from a hard drive with the soundfile library instead. I also noticed that the example code seems to be missing the adaptive s-norm score normalization used in the paper, but I wonder whether that alone could account for such a large gap.
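For reference, this is roughly what I understand adaptive s-norm (AS-norm) to do: standardize each raw cosine score against the top-k most similar cohort scores on both the enrollment and test sides, then average the two normalized scores. The function name, cohort shape, and top_k value below are my own assumptions, not the paper's exact recipe:

```python
import numpy as np

def adaptive_s_norm(raw_score, enroll_emb, test_emb, cohort, top_k=100):
    """Adaptive symmetric score normalization (AS-norm) sketch.

    raw_score : cosine similarity between enrollment and test embeddings.
    cohort    : (N, D) array of unit-normalized cohort speaker embeddings
                (an assumed layout for this sketch).
    """
    top_k = min(top_k, len(cohort))
    # Cosine scores of each trial side against the cohort
    enroll_scores = cohort @ enroll_emb
    test_scores = cohort @ test_emb
    # Keep only the top-k closest cohort scores per side (the "adaptive" part)
    enroll_top = np.sort(enroll_scores)[-top_k:]
    test_top = np.sort(test_scores)[-top_k:]
    # Standardize the raw score against each side's cohort statistics
    z_enroll = (raw_score - enroll_top.mean()) / enroll_top.std()
    z_test = (raw_score - test_top.mean()) / test_top.std()
    return 0.5 * (z_enroll + z_test)

# Toy usage with random unit embeddings (placeholders, not real x-vectors)
rng = np.random.default_rng(0)
e = rng.normal(size=16); e /= np.linalg.norm(e)
t = rng.normal(size=16); t /= np.linalg.norm(t)
cohort = rng.normal(size=(50, 16))
cohort /= np.linalg.norm(cohort, axis=1, keepdims=True)
print(adaptive_s_norm(0.9, e, t, cohort, top_k=20))
```

My understanding is that s-norm mostly shifts the operating threshold and tightens score calibration, so I’d be surprised if omitting it alone explained a 4.7% vs 0.84% gap.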

Any ideas what I’m getting wrong?

Edit: there is a mistake in my original question. I used the VoxCeleb dataset for testing, not LibriSpeech.