System Info
I am trying to train a seq2seq model using the EncoderDecoderModel class and found this blog very helpful. Thanks to @patrickvonplaten for his excellent explanation. Following the blog, I fine-tuned a seq2seq model with [BanglaBERT] (an ELECTRA model) as the encoder and [XGLM] as the decoder on the [BanglaParaphrase] dataset. However, after fine-tuning, the model always generates an empty string or garbage output. I do not understand where the problem is. Can anyone please help me find the bug in my code?
Thanks.
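For context, this is roughly how the model is set up, following the warm-starting recipe from the blog. It is a minimal sketch rather than my exact training script, and the checkpoint names and special-token choices below are assumptions for illustration:

```python
from transformers import AutoTokenizer, EncoderDecoderModel

# Assumed checkpoint names (for illustration only).
enc_name = "csebuetnlp/banglabert"   # BanglaBERT (ELECTRA) encoder
dec_name = "facebook/xglm-564M"      # XGLM decoder

enc_tokenizer = AutoTokenizer.from_pretrained(enc_name)
dec_tokenizer = AutoTokenizer.from_pretrained(dec_name)

# from_encoder_decoder_pretrained sets is_decoder=True and
# add_cross_attention=True on the decoder config, so the decoder
# can attend to the encoder's hidden states.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(enc_name, dec_name)

# The blog stresses that these generation-related tokens must be set
# explicitly on the top-level config; which token to use as the decoder
# start token is an assumption here, not something the blog fixes for XGLM.
model.config.decoder_start_token_id = dec_tokenizer.bos_token_id
model.config.eos_token_id = dec_tokenizer.eos_token_id
model.config.pad_token_id = dec_tokenizer.pad_token_id
model.config.vocab_size = model.config.decoder.vocab_size
```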
Expected behavior
Actual output from my code for one example (the reference target and the model's prediction):
{"target": "<Bangla reference sentence>", "pred_target": ""}
whereas it should look something like this (a Bangla paraphrase of the input sentence):
{"target": "<Bangla reference sentence>", "pred_target": "<Bangla paraphrase of the input sentence>"}
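For completeness, this is roughly how the predictions above are produced at evaluation time, continuing from the setup sketch; the beam-search settings are illustrative rather than my exact values:

```python
import torch

# A Bangla input sentence (placeholder).
source_sentence = "<Bangla input sentence>"

# Encode the source with the *encoder's* tokenizer.
inputs = enc_tokenizer(
    source_sentence, return_tensors="pt", truncation=True, max_length=128
)

with torch.no_grad():
    generated_ids = model.generate(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_length=128,
        num_beams=4,
        early_stopping=True,
    )

# Decode with the *decoder's* tokenizer; skip_special_tokens strips pad/eos,
# so a generated sequence consisting only of special tokens decodes to "".
pred_target = dec_tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print({"pred_target": pred_target})
```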