System Info
I am trying to train a seq2seq model using the EncoderDecoderModel class and found this blog very helpful. Thanks to @patrickvonplaten for his excellent explanation. Following the blog, I fine-tuned a seq2seq model that uses [BanglaBERT] (an ELECTRA model) as the encoder and [XGLM] as the decoder, on the [BanglaParaphrase] dataset. But after fine-tuning, the model always generates an empty string or garbage output. I do not understand where the problem is. Can anyone please help me find the bug in the code?
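For context, here is a minimal sketch of how such an encoder-decoder is assembled following the blog. The checkpoint names below are assumed examples, and my actual script differs in its details:

```python
from transformers import EncoderDecoderModel, AutoTokenizer

# Assumed checkpoint names (examples only, not necessarily the exact ones I used)
encoder_ckpt = "csebuetnlp/banglabert"  # BanglaBERT, an ELECTRA-style encoder
decoder_ckpt = "facebook/xglm-564M"     # XGLM causal LM used as the decoder

enc_tok = AutoTokenizer.from_pretrained(encoder_ckpt)
dec_tok = AutoTokenizer.from_pretrained(decoder_ckpt)

# Warm-start the seq2seq model from the two pretrained checkpoints;
# the decoder gets randomly initialized cross-attention layers.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(encoder_ckpt, decoder_ckpt)

# The blog stresses setting these config fields before training/generation.
model.config.decoder_start_token_id = dec_tok.bos_token_id
model.config.eos_token_id = dec_tok.eos_token_id
model.config.pad_token_id = dec_tok.pad_token_id
```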
Thanks.
Expected behavior
Actual input/output from my code:
{"target": "সিপিও আহত থাকায় যুদ্ধ পরিচালনার দায়িত্ব এসে পড়েছিল সেম্প্রোনিয়াসের কাঁধে।",
"pred_target": ""}
which should instead be something like this (it should give the Bangla paraphrase of the input sentence):
{"target": "সিপিও আহত থাকায় যুদ্ধ পরিচালনার দায়িত্ব এসে পড়েছিল সেম্প্রোনিয়াসের কাঁধে।",
"pred_target": "সিপিও কর্তৃক আহত হয়ে সেমপ্রোনিয়াসের কাঁধে যুদ্ধ পরিচালনার দায়িত্ব এসে।"}
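For reference, the pred_target values above are produced by a generation call roughly like the sketch below (continuing from the snippet above; the decoding parameters are assumed, not the exact ones I used):

```python
import torch

# model, enc_tok and dec_tok as defined in the snippet above
source = "..."  # a Bangla source sentence from BanglaParaphrase
inputs = enc_tok(source, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_length=64,
        num_beams=4,
    )

# If the decoder only emits special tokens (e.g. EOS immediately),
# this decodes to an empty string, which is exactly what I am seeing.
pred_target = dec_tok.decode(output_ids[0], skip_special_tokens=True)
print({"pred_target": pred_target})
```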