Wav2Vec2 WER remains 1.00 and returns blank transcriptions

Hello everyone. I have run into a strange problem and am not sure how to resolve it myself.
I was fine-tuning Wav2Vec2 on two custom datasets. This used to work fine, and I am not sure what changed.

Issue #1: The WER does not decrease; it stays constant at 1.00.
Issue #2: Predictions come back as blank transcriptions. This happened twice, so I decided to run the English fine-tuning blog post exactly as written.
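
The two issues are probably one symptom: an empty prediction deletes every reference word, so its WER is exactly 1.00. A minimal sketch of word-level WER follows, assuming a plain Levenshtein distance over whitespace-separated words. The name `word_error_rate` is hypothetical, and this is not the exact metric implementation used in the blog post.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("the cat sat", ""))             # empty prediction
print(word_error_rate("the cat sat", "the cat sat"))  # perfect prediction
```

With an empty hypothesis the distance equals the number of reference words, so the ratio is 1.00 regardless of the reference text.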

Fine-Tune Wav2Vec2 for English ASR with 🤗 Transformers

I am attaching an image below; the only change I made was to the train and eval steps, to experiment faster.
Please run the following Colab notebook to reproduce the results: Google Colab

[Image: screenshot from 2021-07-30 22-54-23]

Hey there

I was experiencing the same problem on my own dataset. I managed to solve it by using librosa to load the raw audio: librosa.load(filename, sr=16000)
Previously I was using pydub.AudioSegment.
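
One possible explanation, which is an assumption and not confirmed in this thread, is the format of the loaded samples. By default `librosa.load(filename, sr=16000)` returns a mono floating-point waveform resampled to 16 kHz. pydub's `AudioSegment.get_array_of_samples()` yields integer PCM samples at the file's own rate, interleaved across channels for stereo audio. A minimal sketch of a sanity check to run on whatever the loader returns; `check_waveform` and `pcm16_to_float` are hypothetical helpers, not part of either library.

```python
TARGET_SAMPLING_RATE = 16_000  # the rate passed to librosa.load in the post above


def check_waveform(samples, sampling_rate, channels=1):
    """Return a list of problems found; an empty list means the input looks fine."""
    problems = []
    if sampling_rate != TARGET_SAMPLING_RATE:
        problems.append(
            f"sampling rate is {sampling_rate} Hz, expected {TARGET_SAMPLING_RATE} Hz"
        )
    if channels != 1:
        problems.append(f"audio has {channels} channels, expected mono")
    if len(samples) == 0:
        problems.append("waveform is empty")
    elif max(abs(s) for s in samples) > 1.0:
        problems.append("samples exceed [-1.0, 1.0]; they look like integer PCM")
    return problems


def pcm16_to_float(samples):
    """Scale signed 16-bit PCM integers to floats in [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


raw = [0, 16384, -32768, 32767]  # made-up 16-bit samples for illustration
print(check_waveform(raw, 44_100, channels=2))
print(check_waveform(pcm16_to_float(raw), 16_000))
```

The first call reports three problems and the second reports none. Which of these differences, if any, caused the blank transcriptions here is not established in the thread.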

Same issue. Were you able to figure it out?

Can you share a detailed configuration? Please also share the Colab notebook if possible.

absl-py==0.12.0
aiohttp==3.8.0
aiosignal==1.2.0
alabaster @ file:///home/ktietz/src/ci/alabaster_1611921544520/work
anaconda-client==1.7.2
anaconda-navigator==2.0.3
anaconda-project @ file:///tmp/build/80754af9/anaconda-project_1610472525955/work
anyio @ file:///tmp/build/80754af9/anyio_1617783275907/work/dist
appdirs==1.4.4
argh==0.26.2
argon2-cffi @ file:///tmp/build/80754af9/argon2-cffi_1613037097816/work
asn1crypto @ file:///tmp/build/80754af9/asn1crypto_1596577642040/work
astroid @ file:///tmp/build/80754af9/astroid_1613500854201/work
astropy @ file:///tmp/build/80754af9/astropy_1617745353437/work
astunparse==1.6.3
async-generator @ file:///home/ktietz/src/ci/async_generator_1611927993394/work
async-timeout==4.0.0
atomicwrites==1.4.0
attrs @ file:///tmp/build/80754af9/attrs_1604765588209/work
audioread==2.1.9
autopep8 @ file:///tmp/build/80754af9/autopep8_1615918855173/work
Babel @ file:///tmp/build/80754af9/babel_1607110387436/work
backcall @ file:///home/ktietz/src/ci/backcall_1611930011877/work
backports.functools-lru-cache @ file:///tmp/build/80754af9/backports.functools_lru_cache_1618170165463/work
backports.shutil-get-terminal-size @ file:///tmp/build/80754af9/backports.shutil_get_terminal_size_1608222128777/work
backports.tempfile @ file:///home/linux1/recipes/ci/backports.tempfile_1610991236607/work
backports.weakref==1.0.post1
beautifulsoup4 @ file:///home/linux1/recipes/ci/beautifulsoup4_1610988766420/work
bitarray @ file:///tmp/build/80754af9/bitarray_1620827551536/work
bkcharts==0.2
black==19.10b0
bleach @ file:///tmp/build/80754af9/bleach_1612211392645/work
bokeh @ file:///tmp/build/80754af9/bokeh_1620779595936/work
boto==2.49.0
Bottleneck==1.3.2
brotlipy==0.7.0
cachetools==4.2.4
certifi==2020.12.5
cffi @ file:///tmp/build/80754af9/cffi_1613246945912/work
chardet @ file:///tmp/build/80754af9/chardet_1607706746162/work
charset-normalizer==2.0.7
clang==5.0
click @ file:///home/linux1/recipes/ci/click_1610990599742/work
cloudpickle @ file:///tmp/build/80754af9/cloudpickle_1598884132938/work
clyent==1.2.2
colorama @ file:///tmp/build/80754af9/colorama_1607707115595/work
conda==4.10.1
conda-build==3.21.4
conda-content-trust @ file:///tmp/build/80754af9/conda-content-trust_1617045594566/work
conda-package-handling @ file:///tmp/build/80754af9/conda-package-handling_1618262148928/work
conda-repo-cli @ file:///tmp/build/80754af9/conda-repo-cli_1620168426516/work
conda-token @ file:///tmp/build/80754af9/conda-token_1620076980546/work

contextlib2==0.6.0.post1
cryptography @ file:///tmp/build/80754af9/cryptography_1616769286105/work
cycler==0.10.0
Cython @ file:///tmp/build/80754af9/cython_1618435160151/work
cytoolz==0.11.0
dask @ file:///tmp/build/80754af9/dask-core_1617390489108/work
datasets==1.4.1
decorator @ file:///tmp/build/80754af9/decorator_1617916966915/work
defusedxml @ file:///tmp/build/80754af9/defusedxml_1615228127516/work
diff-match-patch @ file:///tmp/build/80754af9/diff-match-patch_1594828741838/work
dill==0.3.4
distributed @ file:///tmp/build/80754af9/distributed_1620902833129/work
docutils @ file:///tmp/build/80754af9/docutils_1620827984873/work
entrypoints==0.3
et-xmlfile==1.0.1
fastcache==1.1.0
ffmpeg==1.4
filelock @ file:///home/linux1/recipes/ci/filelock_1610993975404/work
flake8 @ file:///tmp/build/80754af9/flake8_1615834841867/work
Flask @ file:///home/ktietz/src/ci/flask_1611932660458/work
flatbuffers==1.12
frozendict==2.1.1
frozenlist==1.2.0
fsspec==2021.10.1
future==0.18.2
gast==0.4.0
gdown==4.2.0
gevent @ file:///tmp/build/80754af9/gevent_1616770671827/work
glob2 @ file:///home/linux1/recipes/ci/glob2_1610991677669/work
gmpy2==2.0.8
google-auth==2.3.3
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
googleapis-common-protos==1.53.0
greenlet @ file:///tmp/build/80754af9/greenlet_1611957705398/work
grpcio==1.41.1
h5py==3.1.0
HeapDict==1.0.1
html5lib @ file:///tmp/build/80754af9/html5lib_1593446221756/work
huggingface-hub==0.0.2
idna @ file:///home/linux1/recipes/ci/idna_1610986105248/work
imageio @ file:///tmp/build/80754af9/imageio_1617700267927/work
imagesize @ file:///home/ktietz/src/ci/imagesize_1611921604382/work
importlib-metadata @ file:///tmp/build/80754af9/importlib-metadata_1617874469820/work
iniconfig @ file:///home/linux1/recipes/ci/iniconfig_1610983019677/work
intervaltree @ file:///tmp/build/80754af9/intervaltree_1598376443606/work
ipykernel @ file:///tmp/build/80754af9/ipykernel_1596207638929/work/dist/ipykernel-5.3.4-py3-none-any.whl
ipython @ file:///tmp/build/80754af9/ipython_1617120885885/work
ipython-genutils @ file:///tmp/build/80754af9/ipython_genutils_1606773439826/work
ipywidgets @ file:///tmp/build/80754af9/ipywidgets_1610481889018/work
isort @ file:///tmp/build/80754af9/isort_1616355431277/work
itsdangerous @ file:///home/ktietz/src/ci/itsdangerous_1611932585308/work
jdcal==1.4.1
jedi @ file:///tmp/build/80754af9/jedi_1606932564285/work
jeepney @ file:///tmp/build/80754af9/jeepney_1606148855031/work
Jinja2 @ file:///tmp/build/80754af9/jinja2_1612213139570/work
jiwer==2.2.1
joblib @ file:///tmp/build/80754af9/joblib_1613502643832/work
json5==0.9.5
jsonschema @ file:///tmp/build/80754af9/jsonschema_1602607155483/work
jupyter==1.0.0
jupyter-client @ file:///tmp/build/80754af9/jupyter_client_1616770841739/work
jupyter-console @ file:///tmp/build/80754af9/jupyter_console_1616615302928/work
jupyter-core @ file:///tmp/build/80754af9/jupyter_core_1612213311222/work
jupyter-packaging @ file:///tmp/build/80754af9/jupyter-packaging_1613502826984/work
jupyter-server @ file:///tmp/build/80754af9/jupyter_server_1616083640759/work
jupyterlab @ file:///tmp/build/80754af9/jupyterlab_1619133235951/work
jupyterlab-pygments @ file:///tmp/build/80754af9/jupyterlab_pygments_1601490720602/work
jupyterlab-server @ file:///tmp/build/80754af9/jupyterlab_server_1617134334258/work
jupyterlab-widgets @ file:///tmp/build/80754af9/jupyterlab_widgets_1609884341231/work
keras==2.6.0
Keras-Preprocessing==1.1.2
keyring @ file:///tmp/build/80754af9/keyring_1614616740399/work
kiwisolver @ file:///tmp/build/80754af9/kiwisolver_1612282420641/work
lazy-object-proxy @ file:///tmp/build/80754af9/lazy-object-proxy_1616526917483/work
libarchive-c @ file:///tmp/build/80754af9/python-libarchive-c_1617780486945/work
librosa==0.8.1
llvmlite==0.36.0
locket==0.2.1
lxml @ file:///tmp/build/80754af9/lxml_1616443220220/work
Markdown==3.3.4
MarkupSafe==1.1.1
matplotlib @ file:///tmp/build/80754af9/matplotlib-suite_1613407855456/work
mccabe==0.6.1
mistune==0.8.4
mkl-fft==1.3.0
mkl-random @ file:///tmp/build/80754af9/mkl_random_1618853849286/work
mkl-service==2.3.0
mock @ file:///tmp/build/80754af9/mock_1607622725907/work
more-itertools @ file:///tmp/build/80754af9/more-itertools_1613676688952/work
mpmath==1.2.1
msgpack @ file:///tmp/build/80754af9/msgpack-python_1612287151062/work
multidict==5.2.0
multipledispatch==0.6.0
multiprocess==0.70.12.2
mypy-extensions==0.4.3
navigator-updater==0.2.1
nbclassic @ file:///tmp/build/80754af9/nbclassic_1616085367084/work
nbclient @ file:///tmp/build/80754af9/nbclient_1614364831625/work
nbconvert @ file:///tmp/build/80754af9/nbconvert_1601914830498/work
nbformat @ file:///tmp/build/80754af9/nbformat_1617383369282/work
nemo-toolkit==1.5.0
nest-asyncio @ file:///tmp/build/80754af9/nest-asyncio_1613680548246/work
networkx @ file:///tmp/build/80754af9/networkx_1598376031484/work
nltk @ file:///tmp/build/80754af9/nltk_1618327084230/work
nose @ file:///tmp/build/80754af9/nose_1606773131901/work
notebook @ file:///tmp/build/80754af9/notebook_1616443462982/work
numba @ file:///tmp/build/80754af9/numba_1616774046117/work
numexpr @ file:///tmp/build/80754af9/numexpr_1618856167419/work
numpy==1.19.5
numpydoc @ file:///tmp/build/80754af9/numpydoc_1605117425582/work
oauthlib==3.1.1
olefile==0.46
onnx==1.10.2
opencv-python==4.5.4.58
openpyxl @ file:///tmp/build/80754af9/openpyxl_1615411699337/work
opt-einsum==3.3.0
packaging @ file:///tmp/build/80754af9/packaging_1611952188834/work
pandas==1.2.4
pandocfilters @ file:///tmp/build/80754af9/pandocfilters_1605120460739/work
parso==0.7.0
partd @ file:///tmp/build/80754af9/partd_1618000087440/work
path @ file:///tmp/build/80754af9/path_1614022220526/work
pathlib2 @ file:///tmp/build/80754af9/pathlib2_1607024983162/work
pathspec==0.7.0
patsy==0.5.1
pep8==1.7.1
pexpect @ file:///tmp/build/80754af9/pexpect_1605563209008/work
pickleshare @ file:///tmp/build/80754af9/pickleshare_1606932040724/work
Pillow @ file:///tmp/build/80754af9/pillow_1617383569452/work
pkginfo==1.7.0
pluggy @ file:///tmp/build/80754af9/pluggy_1615976321666/work
ply==3.11
pooch==1.5.2
prometheus-client @ file:///tmp/build/80754af9/prometheus_client_1618088486455/work
promise==2.3
prompt-toolkit @ file:///tmp/build/80754af9/prompt-toolkit_1616415428029/work
protobuf==3.19.1
psutil @ file:///tmp/build/80754af9/psutil_1612298023621/work
ptyprocess @ file:///tmp/build/80754af9/ptyprocess_1609355006118/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl
py @ file:///tmp/build/80754af9/py_1607971587848/work
pyarrow==6.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycodestyle @ file:///home/ktietz/src/ci_mi/pycodestyle_1612807597675/work
pycosat==0.6.3
pycparser @ file:///tmp/build/80754af9/pycparser_1594388511720/work
pycurl==7.43.0.6
pyDeprecate==0.3.1
pydocstyle @ file:///tmp/build/80754af9/pydocstyle_1616182067796/work
pydub==0.25.1
pyerfa @ file:///tmp/build/80754af9/pyerfa_1619390903914/work
pyflakes @ file:///home/ktietz/src/ci_ipy2/pyflakes_1612551159640/work
Pygments @ file:///tmp/build/80754af9/pygments_1615143339740/work
pylint @ file:///tmp/build/80754af9/pylint_1617135829881/work
pyls-black @ file:///tmp/build/80754af9/pyls-black_1607553132291/work
pyls-spyder @ file:///tmp/build/80754af9/pyls-spyder_1613849700860/work
pyodbc===4.0.0-unsupported
pyOpenSSL @ file:///tmp/build/80754af9/pyopenssl_1608057966937/work
pyparsing @ file:///home/linux1/recipes/ci/pyparsing_1610983426697/work
pyrsistent @ file:///tmp/build/80754af9/pyrsistent_1600141720057/work
PySocks @ file:///tmp/build/80754af9/pysocks_1605305779399/work
pytest==6.2.3
python-dateutil @ file:///home/ktietz/src/ci/python-dateutil_1611928101742/work
python-jsonrpc-server @ file:///tmp/build/80754af9/python-jsonrpc-server_1600278539111/work
python-language-server @ file:///tmp/build/80754af9/python-language-server_1607972495879/work
python-Levenshtein==0.12.2
pytorch-lightning==1.5.3
pytz @ file:///tmp/build/80754af9/pytz_1612215392582/work
PyWavelets @ file:///tmp/build/80754af9/pywavelets_1601658317819/work
pyxdg @ file:///tmp/build/80754af9/pyxdg_1603822279816/work
PyYAML==5.4.1
pyzmq==20.0.0
QDarkStyle==2.8.1
QtAwesome @ file:///tmp/build/80754af9/qtawesome_1615991616277/work
qtconsole @ file:///tmp/build/80754af9/qtconsole_1616775094278/work
QtPy==1.9.0
regex @ file:///tmp/build/80754af9/regex_1617569202463/work
requests @ file:///tmp/build/80754af9/requests_1608241421344/work
requests-oauthlib==1.3.0
resampy==0.2.2
rope @ file:///tmp/build/80754af9/rope_1602264064449/work
rsa==4.7.2
Rtree @ file:///tmp/build/80754af9/rtree_1618420845272/work
ruamel-yaml-conda @ file:///tmp/build/80754af9/ruamel_yaml_1616016699510/work
ruamel.yaml==0.17.17
ruamel.yaml.clib==0.2.6
sacremoses==0.0.46
scikit-image==0.18.1
scikit-learn @ file:///tmp/build/80754af9/scikit-learn_1614446682169/work
scipy @ file:///tmp/build/80754af9/scipy_1618855647378/work
seaborn @ file:///tmp/build/80754af9/seaborn_1608578541026/work
SecretStorage @ file:///tmp/build/80754af9/secretstorage_1614022784285/work
Send2Trash @ file:///tmp/build/80754af9/send2trash_1607525499227/work
sentencepiece==0.1.96
simplegeneric==0.8.1
singledispatch @ file:///tmp/build/80754af9/singledispatch_1614366001199/work
sip==4.19.13
six @ file:///tmp/build/80754af9/six_1605205327372/work
sniffio @ file:///tmp/build/80754af9/sniffio_1614030475067/work
snowballstemmer @ file:///tmp/build/80754af9/snowballstemmer_1611258885636/work
sortedcollections @ file:///tmp/build/80754af9/sortedcollections_1611172717284/work
sortedcontainers @ file:///tmp/build/80754af9/sortedcontainers_1606865132123/work
SoundFile==0.10.3.post1
soupsieve @ file:///tmp/build/80754af9/soupsieve_1616183228191/work
Sphinx @ file:///tmp/build/80754af9/sphinx_1620777493457/work
sphinxcontrib-applehelp @ file:///home/ktietz/src/ci/sphinxcontrib-applehelp_1611920841464/work
sphinxcontrib-devhelp @ file:///home/ktietz/src/ci/sphinxcontrib-devhelp_1611920923094/work
sphinxcontrib-htmlhelp @ file:///home/ktietz/src/ci/sphinxcontrib-htmlhelp_1611920974801/work
sphinxcontrib-jsmath @ file:///home/ktietz/src/ci/sphinxcontrib-jsmath_1611920942228/work
sphinxcontrib-qthelp @ file:///home/ktietz/src/ci/sphinxcontrib-qthelp_1611921055322/work
sphinxcontrib-serializinghtml @ file:///home/ktietz/src/ci/sphinxcontrib-serializinghtml_1611920755253/work
sphinxcontrib-websupport @ file:///tmp/build/80754af9/sphinxcontrib-websupport_1597081412696/work
spyder @ file:///tmp/build/80754af9/spyder_1616775618138/work
spyder-kernels @ file:///tmp/build/80754af9/spyder-kernels_1614030590686/work
SQLAlchemy @ file:///tmp/build/80754af9/sqlalchemy_1620712430742/work
statsmodels @ file:///tmp/build/80754af9/statsmodels_1614023746358/work
sympy @ file:///tmp/build/80754af9/sympy_1618252284338/work
tables==3.6.1
tblib @ file:///tmp/build/80754af9/tblib_1597928476713/work
tensorboard==2.7.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.0
tensorflow==2.6.1
tensorflow-addons==0.14.0
tensorflow-datasets==3.2.1
tensorflow-estimator==2.6.0
tensorflow-metadata==1.4.0
termcolor==1.1.0
terminado==0.9.4
testpath @ file:///home/ktietz/src/ci/testpath_1611930608132/work
textdistance @ file:///tmp/build/80754af9/textdistance_1612461398012/work
tf2crf==0.1.33
threadpoolctl @ file:///tmp/tmp9twdgx9k/threadpoolctl-2.1.0-py3-none-any.whl
three-merge @ file:///tmp/build/80754af9/three-merge_1607553261110/work
tifffile==2020.10.1
tokenizers==0.10.3
toml @ file:///tmp/build/80754af9/toml_1616166611790/work
toolz @ file:///home/linux1/recipes/ci/toolz_1610987900194/work
torch @ file:///home/omer/torch-1.8.0.dev20210208%2Bcu110-cp38-cp38-linux_x86_64.whl
torchaudio @ file:///home/omer/torchaudio-0.8.0.dev20210208-cp38-cp38-linux_x86_64.whl
torchmetrics==0.6.0
torchtext @ file:///home/omer/torchtext-0.9.0.dev20210208-cp38-cp38-linux_x86_64.whl
torchvision @ file:///home/omer/torchvision-0.9.0.dev20210208%2Bcu110-cp38-cp38-linux_x86_64.whl
tornado @ file:///tmp/build/80754af9/tornado_1606942300299/work
tqdm==4.49.0
traitlets @ file:///home/ktietz/src/ci/traitlets_1611929699868/work
transformers==4.4.0
typed-ast @ file:///tmp/build/80754af9/typed-ast_1610484547928/work
typeguard==2.13.0
typing-extensions @ file:///home/ktietz/src/ci_mi/typing_extensions_1612808209620/work
ujson @ file:///tmp/build/80754af9/ujson_1611259522456/work
unicodecsv==0.14.1
Unidecode==1.3.2
urduhack==1.1.1
urllib3 @ file:///tmp/build/80754af9/urllib3_1615837158687/work
watchdog @ file:///tmp/build/80754af9/watchdog_1612471027849/work
wcwidth @ file:///tmp/build/80754af9/wcwidth_1593447189090/work
webencodings==0.5.1
Werkzeug @ file:///home/ktietz/src/ci/werkzeug_1611932622770/work
wget==3.2
widgetsnbextension==3.5.1
wrapt==1.12.1
wurlitzer @ file:///tmp/build/80754af9/wurlitzer_1617224664226/work
xlrd @ file:///tmp/build/80754af9/xlrd_1608072521494/work
XlsxWriter @ file:///tmp/build/80754af9/xlsxwriter_1617224712951/work
xlwt==1.3.0
xmltodict==0.12.0
xxhash==2.0.2
yapf @ file:///tmp/build/80754af9/yapf_1615749224965/work
yarl==1.7.2
zict==2.0.0
zipp @ file:///tmp/build/80754af9/zipp_1615904174917/work
zope.event==4.5.0
zope.interface @ file:///tmp/build/80754af9/zope.interface_1616357211867/work

Colab Notebook

I’ve also tried this on different batch sizes and more epochs. Getting the same result. Oddly though, the same notebook used to work a while back.

I found that I had this problem when I had too many trainable parameters and not enough training data. It might help to freeze more parameters in the model.
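
A minimal sketch of freezing more parameters, assuming a `Wav2Vec2ForCTC`-style model whose transformer layers are reachable as `model.wav2vec2.encoder.layers`. The helper name `freeze_lower_encoder_layers` is hypothetical, and how many layers to freeze is a judgment call that this thread does not settle.

```python
def freeze_lower_encoder_layers(model, num_layers_to_freeze):
    """Disable gradients for the first `num_layers_to_freeze` transformer layers.

    Assumes `model.wav2vec2.encoder.layers` is a sequence of layers, each
    exposing `.parameters()`. Returns the number of parameter tensors frozen.
    """
    frozen = 0
    for layer in model.wav2vec2.encoder.layers[:num_layers_to_freeze]:
        for param in layer.parameters():
            param.requires_grad = False  # excluded from gradient updates
            frozen += 1
    return frozen
```

For example, `freeze_lower_encoder_layers(model, 12)` would leave only the upper layers and the CTC head trainable; the value 12 is arbitrary. This is in addition to `model.freeze_feature_encoder()`, which freezes the convolutional feature encoder.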

Did you fix this? I ran into the same problem: the same code, but different results…

I am also facing the same issue. I am fine-tuning the facebook/wav2vec2-xls-r-300m model on a low-resource language dataset. The WER does not change from 1.00 and the predictions are empty strings. I am using transformers==4.24.0. I tried setting the do_normalize parameter of Wav2Vec2FeatureExtractor to False, since there was a related issue on GitHub, but that did not help.

This is my model:

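from transformers import Wav2Vec2ForCTC

# `processor` is the Wav2Vec2Processor built earlier (not shown in this post).
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-xls-r-300m",
    ctc_loss_reduction="mean",
    bos_token_id=processor.tokenizer.bos_token_id,
    eos_token_id=processor.tokenizer.eos_token_id,
    pad_token_id=processor.tokenizer.pad_token_id,  # also used as the CTC blank
    vocab_size=len(processor.tokenizer),
)
# Keep the convolutional feature encoder fixed during fine-tuning.
model.freeze_feature_encoder()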

These are my training args:

from transformers import Trainer, TrainingArguments

# repo_name, data_collator, compute_metrics, quechua and MyCallback
# are defined elsewhere (not shown in this post).
training_args = TrainingArguments(
    output_dir=repo_name,
    do_train=True,
    learning_rate=9.24e-05,
    # params for evaluation
    do_eval=True,
    evaluation_strategy="epoch",
    group_by_length=True,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,
    num_train_epochs=30,
    gradient_checkpointing=True,
    # fp16=True,  # available for cuda
    save_steps=400,
    warmup_steps=500,
    save_total_limit=2,
    push_to_hub=True,
    overwrite_output_dir=True,
    # parameter for logs per epochs
    logging_strategy="epoch",
    logging_dir="./logs",
    log_level="debug",
    report_to="wandb",
    run_name="quechua-30",
    # params for logs per steps
    # eval_steps=400,
    # logging_steps=400,
)
trainer = Trainer(
    model=model,
    data_collator=data_collator,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=quechua["train"],
    eval_dataset=quechua["dev"],
    tokenizer=processor.feature_extractor,
    callbacks=[MyCallback],
)
trainer.train()
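
With a small dataset it can help to compare `warmup_steps` against the total number of optimizer updates, because a 500-step warmup may cover a large share of a short run. A minimal sketch of that arithmetic follows, assuming a single device and a rough approximation of how a Trainer-style loop counts updates. The helper `optimizer_steps` is hypothetical, and the dataset size of 1,000 examples is made up because the post does not state it.

```python
import math


def optimizer_steps(num_examples, per_device_batch_size, grad_accum_steps,
                    num_epochs, num_devices=1):
    """Approximate number of optimizer updates over the whole run."""
    batches_per_epoch = math.ceil(
        num_examples / (per_device_batch_size * num_devices)
    )
    # One update happens every `grad_accum_steps` batches (at least one per epoch).
    updates_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    return updates_per_epoch * num_epochs


# Batch size, accumulation and epochs are taken from the arguments above;
# 1_000 examples is a hypothetical dataset size.
total = optimizer_steps(
    num_examples=1_000,
    per_device_batch_size=16,
    grad_accum_steps=2,
    num_epochs=30,
)
warmup_steps = 500
print(f"total updates: {total}, warmup share: {warmup_steps / total:.0%}")
```

Under these assumed numbers the warmup covers more than half of training, so the learning rate would spend most of the run ramping up. Whether that applies here depends on the real dataset size.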

Hello!
May I ask whether you solved the problem?
I have exactly the same issue you describe. I have searched a lot but haven't found a solution that works. 🤕

Hi all. I experienced this problem too. In my case I'm pretty sure this is what happens when you badly over-train the model. If you look at the logits in your output, you'll probably find that the first logit is always the largest, which I think is code for “nothing”. I tried training on a very small dataset (about 100 examples), logging at each step. For the first couple of steps the WER is below 1; then it goes to 1 and the model returns that empty transcript, which suggests over-training. So at the moment I'm running a hyperparameter search to find a stable learning rate and weight decay. In a nutshell, I think the answer is probably to turn down your learning rate.
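
The logit check described above can be sketched as follows. This assumes the logits for one utterance are available as a nested list of shape (frames, vocab), for example from `model(input_values).logits[0].tolist()`, and that the CTC blank is the tokenizer's pad token, whose index depends on the vocabulary and is not necessarily 0. Both helper names are hypothetical.

```python
def blank_frame_ratio(logits, blank_id):
    """Fraction of frames whose highest-scoring class is the CTC blank."""
    if not logits:
        raise ValueError("logits must contain at least one frame")
    blank_frames = sum(
        1 for frame in logits
        if max(range(len(frame)), key=frame.__getitem__) == blank_id
    )
    return blank_frames / len(logits)


def greedy_ctc_decode(logits, id_to_char, blank_id):
    """Argmax per frame, merge consecutive repeats, then drop blanks."""
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]
    chars, previous = [], None
    for idx in best:
        if idx != previous and idx != blank_id:
            chars.append(id_to_char[idx])
        previous = idx
    return "".join(chars)


# Toy logits with a 3-symbol vocabulary: 0 = blank, 1 = "a", 2 = "b".
vocab = {1: "a", 2: "b"}
collapsed = [[5.0, 0.1, 0.2]] * 4  # blank wins every frame
healthy = [[0.1, 3.0, 0.2], [0.1, 3.0, 0.2], [4.0, 0.1, 0.1], [0.1, 0.2, 3.0]]

print(blank_frame_ratio(collapsed, blank_id=0), repr(greedy_ctc_decode(collapsed, vocab, 0)))
print(blank_frame_ratio(healthy, blank_id=0), repr(greedy_ctc_decode(healthy, vocab, 0)))
```

A blank ratio of 1.0 across evaluation utterances means the model has collapsed to predicting only the blank, which decodes to an empty string and therefore a WER of 1.00.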

Same issue: I am getting a WER of 1.

Any updates?
I am also getting the same issue.
WER remains at 1 during training, and I’m getting empty transcripts during inference.