DialoGPT with irrelevant and weird responses

Hi guys, I'm a fairly fresh junior currently fine-tuning DialoGPT on my own dataset to build a conversational chatbot, but I found that the responses it generates are very irrelevant. At first I thought it was a problem with my dataset, so I switched to a larger one, but that didn't help.

So I tried the original DialoGPT to check whether the base model itself was the problem, and its responses were also very weird (see below). Is this a base-model problem or a mistake on my side? I'm thinking of switching to another model such as GPT-2 (which can be fine-tuned on a Google Colab T4 GPU), but when I ran GPT-2 inference before fine-tuning, it also generated something weird: when I input "Hi", it responded with the following. If anyone can point out what I'm missing or doing wrong, I'd really appreciate it. Thanks in advance.

Chatbot: , “I know you’re a great person and you’re here to do what’s right.”

“No, I’m not,” said I, “I’m not here to do what’s right.”

“No, I’m not here to do what’s right,” said I, “I’m not here to do what’s right.”

“No, I’m not here to do what’s right.”

"No, I’m not here to do what’s right

Response from DialoGPT

User: do you have a good day
DialoGPT: I do, thank you.
User: i feel not bad today also
DialoGPT: I feel good today.
User: i done a bad job in my last year
DialoGPT: i feel bad today
User: can you give me some adavice?
DialoGPT: i feel bad today

The code below is taken from elsewhere; I only adjusted the top_p and top_k values.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-large")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-large")

for step in range(5):
    # Encode the new user input, ending the turn with the EOS token
    new_user_input_ids = tokenizer.encode(input(">> User:") + tokenizer.eos_token, return_tensors='pt')
    print(f'user_token:{new_user_input_ids}')
    # Append the new input to the accumulated chat history (after the first turn)
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

    chat_history_ids = model.generate(
        bot_input_ids,
        max_length=2000,
        top_k=50,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(f'chat_history_ids:{chat_history_ids}')
    # Decode only the newly generated tokens (everything after the prompt)
    print("DialoGPT: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))
#bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids
bot_input_ids = new_user_input_ids

The main cause seems to be the line above: the conversation history is not actually being processed as a conversation history. Since the Transformers API has changed since Microsoft wrote the sample, I've tried rewriting it in a more modern style.
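For reference, the input format DialoGPT was trained on is simply the dialogue turns concatenated, each terminated by the EOS token; assuming the default DialoGPT chat template (which joins turns with eos_token), the templating step in the rewrite below is roughly equivalent to this manual sketch:

history = [{"role": "user", "content": "do you have a good day"}]
msg = "".join(m["content"] + tokenizer.eos_token for m in history)  # turns joined by EOS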

It's much better now, but I think the model itself is strange, especially with the default settings. :sweat_smile:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-large")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-large", torch_dtype=torch.bfloat16).to(device)

questions = ["do you have a good day", "i feel not bad today also", "i done a bad job in my last year", "can you give me some adavice?"]
history = []

for q in questions:
    history.append({"role": "user", "content": q})
    # Re-render the whole conversation each turn so the model sees the full history
    msg = tokenizer.apply_chat_template(history, tokenize=False, add_generation_prompt=True)
    new_user_input_ids = tokenizer.encode(msg + tokenizer.eos_token, return_tensors='pt')
    bot_input_ids = new_user_input_ids

    chat_history_ids = model.generate(
        bot_input_ids.to(device),
        max_new_tokens=1024,
        do_sample=True,  # enable sampling so temperature/top_k/top_p take effect
        temperature=0.7,
        top_k=50,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )

    # Decode only the newly generated tokens (everything after the prompt)
    output = tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)
    history.append({"role": "assistant", "content": output})

    print("User: {}".format(q))
    print("DialoGPT: {}".format(output))
User: do you have a good day
DialoGPT: You're pretty bad at trolling, are you?
User: i feel not bad today also
DialoGPT: You are a good troll.
User: i done a bad job in my last year
DialoGPT: I think you're doing a good job.
User: can you give me some adavice?
DialoGPT: yes, but it's a little bit tough to get
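One caveat with the rewrite above: DialoGPT is GPT-2 based, so its context window is 1024 tokens, and a growing history plus max_new_tokens=1024 can overrun it. A minimal sketch of trimming the prompt before generation, assuming the standard 1024-position limit (the 256-token budget here is an illustrative choice, not from the thread):

max_context = 1024
max_new = 256  # generation budget chosen so prompt + output fit in the window
keep = max_context - max_new

if bot_input_ids.shape[-1] > keep:
    bot_input_ids = bot_input_ids[:, -keep:]  # drop the oldest tokens

chat_history_ids = model.generate(
    bot_input_ids.to(device),
    max_new_tokens=max_new,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)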
