Hello.
I am fine-tuning wav2vec ("wav2vec2-large-lv60") using my own dataset. I followed Patrick's tutorial (Fine-Tune Wav2Vec2 for English ASR in Hugging Face with 🤗 Transformers) and successfully finished the fine-tuning (thanks for the very nice tutorial).
Now, I would like to run decoding with a language model and have a few questions.
- Can we run decoding with a language model directly from Hugging Face? (For context, the LM-free greedy decoding I currently use is sketched below.)
- If not, how can I make the wav2vec model compatible with the fairseq decoding script (fairseq/examples/speech_recognition/infer.py)?
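For reference, here is the greedy (argmax) CTC decoding I can already run on the fine-tuned model, with no language model involved. The checkpoint directory and `sample.wav` are placeholders:

```python
import soundfile as sf
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Placeholder paths: a fine-tuned checkpoint directory and a 16 kHz mono wav.
model = Wav2Vec2ForCTC.from_pretrained("path/to/my_checkpoint")
processor = Wav2Vec2Processor.from_pretrained("path/to/my_checkpoint")

speech, sampling_rate = sf.read("sample.wav")
inputs = processor(speech, sampling_rate=sampling_rate, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: take the argmax token at each frame, then collapse.
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]
print(transcription)
```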
I did the following steps, but it failed:

1. Create a `.pt` file from the fine-tuning checkpoint:

```python
import torch
from transformers import Wav2Vec2ForCTC

def save_model(my_checkpoint_path):
    # Load the fine-tuned Hugging Face model and dump only its weights.
    model = Wav2Vec2ForCTC.from_pretrained(my_checkpoint_path)
    torch.save(model.state_dict(), "my_model.pt")  # filename must be a string

# save_model("path/to/my_checkpoint")
```
2. Decoding

I used the decoding command from https://github.com/pytorch/fairseq/blob/master/examples/wav2vec/README.md#evaluating-a-ctc-model:

```bash
subset=dev_other
python examples/speech_recognition/infer.py /checkpoint/abaevski/data/speech/libri/10h/wav2vec/raw --task audio_pretraining \
  --nbest 1 --path /path/to/model --gen-subset $subset --results-path /path/to/save/results/for/sclite --w2l-decoder kenlm \
  --lm-model /path/to/kenlm.bin --lm-weight 2 --word-score -1 --sil-weight 0 --criterion ctc --labels ltr --max-tokens 4000000 \
  --post-process letter
```

(Note: the README writes `$subset=dev_other`; I changed it to `subset=dev_other` so the shell assignment works.)
I replaced `/path/to/model` with `my_model.pt`. Then I get the following error:
```
Traceback (most recent call last):
  File "/mount/fairseq/examples/speech_recognition/infer.py", line 427, in <module>
    cli_main()
  File "/mount/fairseq/examples/speech_recognition/infer.py", line 423, in cli_main
    main(args)
  File "/mount/fairseq/examples/speech_recognition/infer.py", line 229, in main
    models, saved_cfg, task = checkpoint_utils.load_model_ensemble_and_task(
  File "/mount/fairseq/fairseq/checkpoint_utils.py", line 370, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/mount/fairseq/fairseq/checkpoint_utils.py", line 304, in load_checkpoint_to_cpu
    state = _upgrade_state_dict(state)
  File "/mount/fairseq/fairseq/checkpoint_utils.py", line 456, in _upgrade_state_dict
    {"criterion_name": "CrossEntropyCriterion", "best_loss": state["best_loss"]}
KeyError: 'best_loss'
```
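Reading the traceback, the KeyError seems to come from fairseq's `_upgrade_state_dict`: my `.pt` file is a bare `state_dict`, so it has no `optimizer_history` key, and fairseq then treats it as an old-style checkpoint that should carry `best_loss`. Roughly (reconstructed from the traceback above; the exact code may differ between fairseq versions):

```python
# fairseq/checkpoint_utils.py, _upgrade_state_dict (reconstructed):
if "optimizer_history" not in state:
    state["optimizer_history"] = [
        {"criterion_name": "CrossEntropyCriterion", "best_loss": state["best_loss"]}
    ]
# A bare state_dict has neither "optimizer_history" nor "best_loss",
# hence the KeyError: 'best_loss'.
```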
When I googled the error, this explanation seems relevant; it concerns the removal of optimization-history logs from released models:

> This happens because we remove the useless optimization history logs from the model to reduce the file size. Only the desired model weights are kept to release. As a result, if you directly load the model, an error will be reported that some logs are missing.
So: how can I save the fine-tuned model in a fairseq-compatible format? Should I store the optimization history, and if yes, how? One workaround I am considering is sketched below. Has anyone had the same experience? If so, could you please share it? Thank you always.
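This is an untested sketch of the workaround; the checkpoint key layout (`model`, `optimizer_history`, `extra_state`, `args`) is only my assumption about what fairseq expects, based on the traceback above:

```python
import torch
from transformers import Wav2Vec2ForCTC

# Untested sketch: wrap the bare state_dict in a fairseq-style checkpoint
# dict so that _upgrade_state_dict finds the keys it expects.
# NOTE: even if loading then succeeds, the Hugging Face parameter names
# differ from fairseq's Wav2Vec2 names, so the weights would probably
# still need to be renamed to match the fairseq model definition.
model = Wav2Vec2ForCTC.from_pretrained("path/to/my_checkpoint")  # placeholder path
state = {
    "model": model.state_dict(),
    "optimizer_history": [],  # assumption: an empty history is accepted
    "extra_state": {},
    "args": None,  # fairseq normally stores the training config here
}
torch.save(state, "my_model.pt")
```

But I am not confident this is the right direction, hence my questions above.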