Weak Conversational Skills - dialogPT trained model issue

I noticed that in most, if not all, dialogPT tutorials, when somebody fine-tunes the model on their own data, the answers it returns turn into things like “!!!?!?!!;,!.com?!”, “!!!”, or “” after about 3-5 questions. I also had this problem in my own training code. Why is that?

From my experience this correlates with:

  1. Lack of fine-tuning for your specific length. I don’t know why this is the case, but I have noticed a significant drop in this “!!!?!?!!;,!.com?!” output once the fine-tuning dataset gets larger.
  2. This seems to occur only on dialoGPT-small; I have not seen it once on the medium version. This is not that big a deal, since if you can train dialoGPT-small, you will generally be able to train dialoGPT-medium on the same GPU.
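
To check observations like the two above on your own runs, it may help to measure at which turn the replies break down, for example when comparing model sizes or dataset sizes. Below is a minimal sketch in plain Python. The helper names `is_degenerate` and `first_degenerate_turn` and the 0.5 punctuation threshold are my own assumptions for illustration; they are not part of DialoGPT or any library.

```python
import string


def is_degenerate(reply: str, max_punct_ratio: float = 0.5) -> bool:
    """Heuristic (assumption): a reply is degenerate if it is empty or if
    more than max_punct_ratio of its non-space characters are punctuation."""
    chars = [c for c in reply if not c.isspace()]
    if not chars:
        return True
    punct = sum(1 for c in chars if c in string.punctuation)
    return punct / len(chars) > max_punct_ratio


def first_degenerate_turn(replies):
    """Return the 1-based index of the first degenerate reply, or None."""
    for turn, reply in enumerate(replies, start=1):
        if is_degenerate(reply):
            return turn
    return None


# Example: feed in the bot replies collected from one chat session.
session = ["Hello there.", "I like pizza.", "!!!?!?!!;,!.com?!", "!!!", ""]
print(first_degenerate_turn(session))
```

Running this over several chat sessions per checkpoint gives a rough number to compare, instead of eyeballing the output.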

P.S. You had me confused for a second there :blush:. It’s not “dialogPT”, it’s dialoGPT, as it’s based on the GPT-2 model.

“Lack of fine-tuning for your specific length”: could you specify what you mean? Is it more on the data cleaning / structuring side, or on the training side (training differently, or for more checkpoints)? FWIW, I have a beefy 3090 and see the same thing on medium and large :sweat_smile:

Also, are you available for consultation? It would be great to discuss this in detail for 3-4 hours.