Pre-training for Wav2Vec2-XLSR via Huggingface

Hi guys! I've noticed that most topics here are about fine-tuning a pre-trained model. But if I have some new unlabeled data, how can I perform the pre-training process via Huggingface?

Hey Javen,

We now have an official wav2vec2 pre-training example here: transformers/examples/pytorch/speech-pretraining, on the master branch of huggingface/transformers on GitHub.

Hi @patrickvonplaten, I re-ran this script on Google Colab. I passed all the parameters exactly as recommended in the README, but after a few epochs the loss stops decreasing.

| loss: 9.969e-02| constrast_loss: 0.000e+00| div_loss: 9.969e-01| %_mask_idx: 5.137e-01| ppl: 2.000e+00| lr: 1.572e-03| temp: 1.902e+00| grad_norm: 8.068e-19
| loss: 9.969e-02| constrast_loss: 0.000e+00| div_loss: 9.969e-01| %_mask_idx: 4.952e-01| ppl: 2.000e+00| lr: 1.572e-03| temp: 1.902e+00| grad_norm: 4.017e-19
| loss: 9.969e-02| constrast_loss: 0.000e+00| div_loss: 9.969e-01| %_mask_idx: 4.831e-01| ppl: 2.000e+00| lr: 1.572e-03| temp: 1.902e+00| grad_norm: 5.166e-19

Can you take a look? What did I miss?

Hi @patrickvonplaten and @tiena2cva,
Thanks for the new official wav2vec2-pretraining example, this helps a lot!
I had the same problem as @tiena2cva. I tried to re-run the demo script with the same parameters on my own GPU. After a few epochs the contrastive loss dropped to zero and the model stopped changing.
Running inference showed that the quantizer maps all time steps to the same vector (this can be seen in projected_quantized_states), which explains the zero contrastive loss.
I would have thought that the diversity loss weight should be increased, but I used the parameters given in the README file, so this behavior is unexpected and may indicate a different problem.
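
For anyone trying to interpret the logged numbers, here is a minimal sketch of why a collapsed quantizer would produce them. It assumes the diversity loss is computed as (G*V - perplexity) / (G*V), with G = 2 codebook groups of V = 320 entries each and a diversity loss weight of 0.1. These values are assumed to be the defaults and should be checked against your own config. The function names are illustrative and not part of the library.

```python
import math

def codebook_perplexity(probs_per_group):
    """Sum over groups of exp(entropy) of each group's codevector distribution."""
    total = 0.0
    for probs in probs_per_group:
        # Zero-probability entries contribute nothing to the entropy.
        entropy = -sum(p * math.log(p) for p in probs if p > 0)
        total += math.exp(entropy)
    return total

def diversity_loss(perplexity, num_groups, num_vectors):
    """Assumed form of the loss: 0 when all entries are used equally, near 1 on collapse."""
    n = num_groups * num_vectors
    return (n - perplexity) / n

G, V = 2, 320  # assumed defaults, not taken from the thread

# Collapsed quantizer: every time step picks the same entry in each group.
collapsed = [[1.0] + [0.0] * (V - 1) for _ in range(G)]
# Healthy extreme: every entry is used equally often.
uniform = [[1.0 / V] * V for _ in range(G)]

ppl_collapsed = codebook_perplexity(collapsed)
div_collapsed = diversity_loss(ppl_collapsed, G, V)
ppl_uniform = codebook_perplexity(uniform)
div_uniform = diversity_loss(ppl_uniform, G, V)

# With a zero contrastive loss and an assumed diversity weight of 0.1:
total_loss_collapsed = 0.0 + 0.1 * div_collapsed

print(ppl_collapsed, div_collapsed, total_loss_collapsed)
print(ppl_uniform, div_uniform)
```

Under these assumptions the collapsed case gives a perplexity of 2, a diversity loss of about 0.9969 and a total loss of about 0.09969, which is consistent with the ppl, div_loss and loss values in the log lines posted above. That would suggest each group has collapsed onto a single codebook entry.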

Can you please help?

Hi @patrickvonplaten and @sgugger, I am facing the same issue: constrast_loss goes straight to zero after 3 backward passes and does not change even after many epochs. Any idea how we can reproduce the results? Thanks.

Hi @tiena2cva and @ayanas, any update or solution to your problem?

Hey @tiena2cva, were you able to pre-train your model? If yes, can you share your code?

Hi @Kshitizkhandelwal, it was a long time ago, so let me check in the next day or so. I have pre-trained wav2vec2 for Vietnamese.

@tiena2cva If possible, please let me know about this.

Good day! I am constantly getting the error “realloc of size … failed”. Is there any way to get around it?

I have the same situation here.
I tried to pre-train wav2vec2 on 8K and 16K samples. After a few steps the contrastive loss goes to 0, the diversity loss shoots up to 1 and the perplexity drops to 2.
I also tried changing hyperparameters such as the learning rate and the Gumbel temperature, but no luck.
Any updates?
@tiena2cva @patrickvonplaten

Do you have any updates? I’m experiencing a similar issue where the contrastive loss starts off very high, around 1500, but after the warmup period it quickly drops to 0.0 within 10K steps. Conversely, the diversity loss starts at around 300, gradually increases to 400, and then oscillates.

Hi @mohammadtvk, I am experiencing the same problem. Did you find a solution?

Unfortunately no. I used fairseq for pre-training instead.

I also switched to fairseq. I think several pieces of LR and threshold management are missing in the HF example.

Hi guys,
The wav2vec 2.0 paper mentions some customizations to the training loop, such as gradient scaling, a penalty on the feature encoder, and temperature annealing. I am not sure whether they are implemented in the example script. Here is a tutorial that lets you customize the training loop by extending the Huggingface Trainer class. The tutorial extends one method of the Trainer class but also points to more resources for implementing the full functionality.
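
As an illustration of one of these customizations, here is a minimal sketch of Gumbel temperature annealing. It assumes the schedule described in the paper: the temperature starts at 2.0, is multiplied by 0.999995 at every update, and is floored at 0.5. The function name and signature are hypothetical. Whether your training script applies such a schedule, and with which values, needs to be checked in the script itself.

```python
def annealed_gumbel_temperature(step, max_temp=2.0, min_temp=0.5, decay=0.999995):
    """Exponentially decay the temperature per update, never going below min_temp.

    The default values are assumptions based on the schedule described in the
    wav2vec 2.0 paper; they are not read from any particular training script.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    return max(max_temp * decay ** step, min_temp)

# Print the temperature at a few points during training.
for step in (0, 10_000, 100_000, 1_000_000):
    print(step, round(annealed_gumbel_temperature(step), 4))
```

In a custom loop, a value like this would be computed after each optimizer step and passed to the model's quantizer. Under these assumed values, the temp of about 1.902 in the log lines earlier in the thread would correspond to roughly 10,000 updates.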

This might be useful for someone.
