I tried to re-run the demo script with the same parameters on Colab. After a few epochs, the contrastive loss dropped to zero and the model stopped updating. The original script can be found here:
Here is the sample code on Colab:
Sample output:
| loss: 9.969e-02| constrast_loss: 0.000e+00| div_loss: 9.969e-01| %_mask_idx: 5.137e-01| ppl: 2.000e+00| lr: 1.572e-03| temp: 1.902e+00| grad_norm: 8.068e-19
| loss: 9.969e-02| constrast_loss: 0.000e+00| div_loss: 9.969e-01| %_mask_idx: 4.952e-01| ppl: 2.000e+00| lr: 1.572e-03| temp: 1.902e+00| grad_norm: 4.017e-19
| loss: 9.969e-02| constrast_loss: 0.000e+00| div_loss: 9.969e-01| %_mask_idx: 4.831e-01| ppl: 2.000e+00| lr: 1.572e-03| temp: 1.902e+00| grad_norm: 5.166e-19
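As a sanity check on the numbers in this log, here is a minimal sketch that parses one log line and tests whether the values are mutually consistent. It rests on assumptions I have not verified against the script: that the quantizer has 640 codevectors in total (2 groups of 320), that the diversity term is (num_codevectors - ppl) / num_codevectors, and that the total loss is the contrastive loss plus 0.1 times the diversity loss. The helper names `parse_log_line` and `check_collapse` and the `1e-12` gradient threshold are hypothetical, chosen only for illustration.

```python
import math
import re

# Matches "name: value" pairs such as "div_loss: 9.969e-01" in one log line.
PAIR = re.compile(r"([%\w]+):\s*([-+0-9.eE]+)")


def parse_log_line(line):
    """Turn one '| key: value| key: value ...' log line into a dict of floats."""
    return {key: float(value) for key, value in PAIR.findall(line)}


def check_collapse(metrics, num_codevectors=640, diversity_weight=0.1):
    """Compare logged values with what the assumed formulas predict.

    num_codevectors and diversity_weight are ASSUMED defaults, not values
    read from the script or the log.
    """
    expected_div = (num_codevectors - metrics["ppl"]) / num_codevectors
    expected_loss = metrics["constrast_loss"] + diversity_weight * metrics["div_loss"]
    return {
        # Diversity loss matches the assumed formula for the logged perplexity.
        "div_loss_matches_ppl": math.isclose(metrics["div_loss"], expected_div, abs_tol=1e-3),
        # Total loss is explained by the diversity term alone.
        "loss_is_only_diversity": math.isclose(metrics["loss"], expected_loss, abs_tol=1e-4),
        # Gradient norm is numerically zero, so the weights barely move.
        "gradients_vanished": metrics["grad_norm"] < 1e-12,
    }


line = (
    "| loss: 9.969e-02| constrast_loss: 0.000e+00| div_loss: 9.969e-01| "
    "%_mask_idx: 5.137e-01| ppl: 2.000e+00| lr: 1.572e-03| temp: 1.902e+00| "
    "grad_norm: 8.068e-19"
)
metrics = parse_log_line(line)
report = check_collapse(metrics)
print(report)
```

If those assumptions hold, a perplexity of 2 would mean only one codevector per group is in use, which would be consistent with the contrastive loss being exactly zero and the gradient norm staying around 1e-19.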
I used the parameters given in the README file, so this behavior is unexpected and may indicate a different problem.
Is this a bug in the official script, or am I making a mistake? If it is my mistake, how can I fix it?