GPT chatbot challenge

I know interest in chatbots has increased dramatically, and I am caught up in the fray of using GPT-2 and GPT-3 text generation for chatbot communication.

I am trying to figure out the best way to use GPT, either for seeding responses or for creating entirely new ones; I am not sure which. From what I have gathered, GPT works by continuing seed text, so we can have it pick up the scene and the dialog between two people.

My approach has been to describe the people involved, then seed it with what is happening, including a bit of conversation history as it goes. Then I leave the bot's turn empty for the model to fill in.
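As a rough sketch of that prompt layout (persona facts, a scene line, the running history, then an empty bot turn), something like the following could assemble the text. The `Persona` class and `build_prompt` function are hypothetical names used for illustration only; they are not part of any library.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Persona:
    """Hypothetical container for the facts listed at the top of the prompt."""
    name: str
    gender: str
    age: int

    def describe(self) -> List[str]:
        # One fact per line, mirroring the "Scott is a male / Scott is 36" style.
        return [f"{self.name} is a {self.gender}", f"{self.name} is {self.age}"]


def build_prompt(user: Persona, bot: Persona,
                 history: List[Tuple[str, str]], user_message: str) -> str:
    """Assemble persona lines, a scene line, the history, and an empty bot turn."""
    lines = user.describe() + bot.describe()
    lines.append(f"This is a conversation between {user.name} and {bot.name}")
    for speaker, text in history:
        lines.append(f"{speaker}: {text}")
    lines.append(f"{user.name}: {user_message}")
    # The bot's turn is left empty so the model continues from here.
    lines.append(f"{bot.name}:")
    return "\n".join(lines)


scott = Persona("Scott", "male", 36)
alexi = Persona("Alexi", "female", 34)
prompt = build_prompt(scott, alexi, [("Scott", "hello"), ("Alexi", "hi")], "hows it going")
print(prompt)
```

Keeping the history as a list of `(speaker, text)` pairs makes it easy to trim old turns later if the prompt grows too long for the model's context window.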

I used “EleutherAI/gpt-neo-1.3B”.

Like this:
Scott is me and Alexi is the bot.

Scott is a male
Scott is 36
Alexi is a female
Alexi is 34
This is a conversation between Scott and Alexi
Scott: hello
Alexi: hi
Scott: hows it going
Alexi: you?
Scott: what you up to
Alexi: i just had a long day at work and i'm tired
Scott: awww im sorry
Alexi: Alexi: you know what?
Scott: what?
Alexi: i'm sorry
Scott: for what?
Alexi: i was just saying to you that i'm really sorry
Scott: usually you are sorry about something
Alexi: you were talking to me about a long time ago
Scott: ummm no I wasn't
Alexi: i'm sorry
Scott: lol ok
Alexi: i was just saying to you that i'm really sorry
Scott: lets change the subject
Alexi: umm
Scott: you go first

Although this does OK at first, over time I either get repeated responses or it goes into a “loop” of sorts and starts making less sense, like this:

Scott is a male
Scott is 36
Alexi is a female
Alexi is 22
Scott: hello how are you
Alexi: how are you
Scott: what you doing?
Alexi: I'm a little tired
Scott: awww im sorry
Alexi: ive been up since 6:00am
Scott: oh my goodness you didn't sleep
Alexi: ive been up since 5:30am
Scott: i thought you just said 6
Alexi: ive been up since 4:30am
Scott: I give up
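One possible way to soften the symptoms in these transcripts is to post-process each generated continuation before it is added to the history: keep only the first line, drop a duplicated speaker tag (as in the "Alexi: Alexi:" line of the first transcript), and flag replies the bot has already given recently so they can be regenerated. This is only a sketch; `extract_reply` and `is_repeat` are hypothetical helpers, and the exact-match check is an assumption that will not catch near-duplicates.

```python
from typing import List, Tuple


def extract_reply(continuation: str, bot_name: str) -> str:
    """Return the first non-empty line of the model's continuation,
    with any leading '<bot_name>:' tags removed."""
    for line in continuation.splitlines():
        line = line.strip()
        if line:
            break
    else:
        # No non-empty line was generated at all.
        return ""
    prefix = f"{bot_name}:"
    while line.startswith(prefix):
        line = line[len(prefix):].strip()
    return line


def _normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivial variations still match.
    return " ".join(text.lower().split())


def is_repeat(reply: str, history: List[Tuple[str, str]],
              bot_name: str, window: int = 4) -> bool:
    """True if the reply exactly matches one of the bot's last `window` replies."""
    recent = [text for speaker, text in history if speaker == bot_name][-window:]
    return _normalize(reply) in {_normalize(text) for text in recent}


history = [("Scott", "for what?"), ("Alexi", "i'm sorry"), ("Scott", "lol ok")]
reply = extract_reply(" Alexi: i'm sorry\nScott: what?", "Alexi")
print(reply, is_repeat(reply, history, "Alexi"))
```

When `is_repeat` returns true, the caller could sample again with different settings. Replies that only differ in a detail, such as the changing wake-up times above, would need a fuzzier comparison than this one.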

So my question is: has anyone found a better approach for this application? I know Replika has a much more controlled approach and other ways of ranking responses. I am looking more for the QuickChat approach of being more open-ended.

In short, is there a better way to structure the input to GPT to get better responses?

Hello @AtherionGG and welcome to our Forum!

It seems you are looking for a good trade-off between sense and variation in your model’s output. I’d suggest this awesome blog post on tuning the decoding in generative models.
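To make the decoding suggestion concrete, here is a minimal sketch of sampling settings that could be passed to `generate` in 🤗 Transformers. The parameter names (`do_sample`, `temperature`, `top_k`, `top_p`, `repetition_penalty`, `no_repeat_ngram_size`, `max_new_tokens`) are real `generate` arguments, but the values are illustrative assumptions rather than tuned recommendations, and `sampling_kwargs` and `generate_reply` are hypothetical helper names.

```python
def sampling_kwargs(temperature=0.8, top_p=0.92, top_k=50,
                    repetition_penalty=1.2, no_repeat_ngram_size=3,
                    max_new_tokens=40):
    """Collect decoding settings; the defaults are guesses to start tuning from."""
    return {
        "do_sample": True,                  # sample instead of greedy decoding
        "temperature": temperature,         # lower = safer, higher = more varied
        "top_k": top_k,                     # keep only the k most likely tokens
        "top_p": top_p,                     # nucleus sampling cutoff
        "repetition_penalty": repetition_penalty,      # discourage reused tokens
        "no_repeat_ngram_size": no_repeat_ngram_size,  # block repeated n-grams
        "max_new_tokens": max_new_tokens,   # a chat turn is usually short
    }


def generate_reply(prompt, model_name="EleutherAI/gpt-neo-1.3B", **overrides):
    """Generate a continuation of `prompt`; downloads the model on first use."""
    # Imported lazily so the settings above can be inspected without the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    inputs = tokenizer(prompt, return_tensors="pt")
    kwargs = sampling_kwargs()
    kwargs.update(overrides)
    output_ids = model.generate(
        **inputs, pad_token_id=tokenizer.eos_token_id, **kwargs
    )
    # Decode only the newly generated tokens, not the prompt itself.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


print(sampling_kwargs(temperature=0.7))
```

Lowering `temperature` or `top_p` tends to push toward sense, while raising them tends to push toward variation; `repetition_penalty` and `no_repeat_ngram_size` target the looping behavior directly.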

Ya, I read through that. It is a good article.
I also went through this one: Guiding Text Generation with Constrained Beam Search in 🤗 Transformers.
I’ll play with the model parameters more this weekend and see if I can get better results.