While fine-tuning w2v-bert, the WER is not decreasing. I've fine-tuned wav2vec2 and XLS-R with no problems, but w2v-bert has some issue during fine-tuning. Can anyone help?
| Step | Training Loss | Validation Loss | WER      |
|------|---------------|-----------------|----------|
| 300  | 4.636700      | inf             | 0.993919 |
| 600  | 18.166900     | nan             | 1.000000 |
| 900  | 0.000000      | nan             | 1.000000 |
| 1200 | 0.000000      | nan             | 1.000000 |
Below is my Colab notebook.
I'm not familiar with voice model training, but something doesn't seem right. Take a look at the following page; it may describe environments that are known to work.
I've been modifying some Spaces recently, and it struck me that all of the voice-related libraries are very old. It's possible that some libraries that are not explicitly pinned in pip are doing something wrong.
In the context of generative AI, even functions that are six months old may be obsolete. Even when there is no syntax error, the behavior may have changed, which is tricky.
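As a sketch of what that pinning could look like in the notebook (the version numbers below are placeholders, not known-good values; substitute the versions from an environment that actually works):

```python
# Notebook cell: pin the main training libraries explicitly so a silent
# upgrade of an unpinned dependency can't change behavior between runs.
# The version numbers here are hypothetical placeholders.
!pip install "transformers==4.38.0" "datasets==2.18.0" "accelerate==0.27.0" "evaluate==0.4.1"
```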
Can you manually try printing the loss and the gradient norm, maybe with a debugger?
See if the training loss or the gradient norm is becoming infinite. In that case, your run might be suffering from a gradient explosion problem.
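A minimal sketch of that check, assuming a plain PyTorch training step (`model` and `batch` are placeholders for your own setup; with the `Trainer` you would wire the same computation into a callback):

```python
import torch

def global_grad_norm(model: torch.nn.Module) -> float:
    """Global L2 norm over all parameter gradients.
    An inf/nan value right before optimizer.step() points to exploding gradients."""
    norms = [p.grad.detach().norm(2) for p in model.parameters() if p.grad is not None]
    if not norms:
        return 0.0
    return torch.norm(torch.stack(norms), 2).item()

# Hypothetical manual step:
# loss = model(**batch).loss
# print(f"loss: {loss.item():.4f}")                   # inf/nan here -> the loss itself diverged
# loss.backward()
# print(f"grad norm: {global_grad_norm(model):.2f}")  # inf/nan here -> exploding gradients
```

If the norm does blow up, lowering the learning rate or tightening gradient clipping (e.g. `max_grad_norm` in `TrainingArguments`) is the usual first remedy.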