Hello
I want to simulate a user-agent conversation between two LLMs for an experiment. Both models are given system prompts, s_u and s_a, describing their behaviour. The user model generates the first utterance of the conversation, say u1. The agent, with access to s_a and u1, comes up with its response a1. The user, with access to s_u, u1, and a1, generates u2, and so on. What is the most efficient way to achieve this with Hugging Face? This is how I think it should go:
import transformers

s_u = "<user system prompt>"
s_a = "<agent system prompt>"

pipe = transformers.pipeline('text-generation', model='meta-llama/Meta-Llama-3.1-8B-Instruct')
gen_kwargs = {'max_new_tokens': 256}  # generation settings

T = 10  # number of conversation turns

# Each model sees its own utterances as 'assistant' turns and its
# counterpart's utterances as 'user' turns.
user_history = [{'role': 'system', 'content': s_u}]
agent_history = [{'role': 'system', 'content': s_a}]

for t in range(T):
    if t % 2 == 0:  # even turns: the user model speaks, so u1 opens the chat
        # With chat input, generated_text is the full message list;
        # the newly generated message is its last element.
        output = pipe(user_history, **gen_kwargs)[0]['generated_text'][-1]['content']
        user_history.append({'role': 'assistant', 'content': output})
        agent_history.append({'role': 'user', 'content': output})
    else:  # odd turns: the agent model responds
        output = pipe(agent_history, **gen_kwargs)[0]['generated_text'][-1]['content']
        agent_history.append({'role': 'assistant', 'content': output})
        user_history.append({'role': 'user', 'content': output})
Essentially, for T turns, the conversation alternates between the user and the agent, and both histories are updated after each utterance. I haven't been able to test this out yet, but I'd appreciate any comments on the method. Should I be using a different pipe
for the agent and the user? Also, is there a way I can introduce early termination of the chat (i.e., in fewer than 10 turns)?
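For early termination, one option I'm considering is to instruct the user model (via s_u) to emit a fixed marker when it wants to end the chat, and to break out of the loop as soon as the marker appears. A minimal sketch of the loop, assuming such a marker (the [DONE] string is my own placeholder, and s_u would have to ask the model to produce it):

STOP_MARKER = '[DONE]'  # hypothetical end-of-chat marker requested in s_u

for t in range(T):
    history = user_history if t % 2 == 0 else agent_history
    output = pipe(history, **gen_kwargs)[0]['generated_text'][-1]['content']
    if STOP_MARKER in output:
        break  # the speaking model signalled the end of the conversation
    # ... append output to both histories as in the loop above ...

Would something like this be the idiomatic approach, or does the pipeline expose a cleaner stopping mechanism?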