Hi everyone.
I am currently following this tutorial: “How to Train an AI Chatbot With Custom Knowledge Base Using ChatGPT API | Beebom”.
I have done some tests and it works really well with PDFs.
Now, I want to create a Q&A chatbot.
I created a CSV file with the common questions and the answers to those questions.
But when I train the model with the CSV file I created and ask, for example, question 1, it replies with answer 2.
Is there a way to ensure that questions stay tied to their corresponding answers, so the model doesn't mix them up?
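For context, the CSV is shaped roughly like this (these rows are just placeholders, not my real data):

```
question,answer
"What are your opening hours?","We are open from 9:00 to 18:00 on weekdays."
"Where are you located?","Our office is near the main station."
```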
My code is as follows (I think this is more of a conceptual question than a code question, but just in case):
```
import os
from llama_index import (
    SimpleDirectoryReader,
    GPTListIndex,
    GPTVectorStoreIndex,
    LLMPredictor,
    PromptHelper,
    ServiceContext,
    StorageContext,
    load_index_from_storage,
)
# from langchain.chat_models import ChatOpenAI
from langchain import OpenAI
import gradio as gr

os.environ["OPENAI_API_KEY"] = 'my key'

# System instructions (note: these are defined but never passed to the index,
# so they currently have no effect on the responses).
messages = [
    {"role": "system", "content": "follow the three instructions below for your outputs:"},
    {"role": "system", "content": "1. make sure all expressions are compatible with Japanese"},
    {"role": "system", "content": "2. replying in English is strictly forbidden"},
    {"role": "system", "content": "3. use Japanese only for outputs"},
]

def load_index():
    max_input_size = 4096
    max_chunk_overlap = 20
    chunk_size_limit = 600
    num_outputs = 768

    prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap,
                                 chunk_size_limit=chunk_size_limit)
    llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.4,
                                            model_name="text-davinci-003",
                                            max_tokens=num_outputs))

    global index
    service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor,
                                                   prompt_helper=prompt_helper)
    # load_index_from_storage takes a StorageContext rather than a file path;
    # 'index.json' is the directory the index was persisted to.
    storage_context = StorageContext.from_defaults(persist_dir='index.json')
    index = load_index_from_storage(storage_context, service_context=service_context)
    return "indexing finished"

def chat(chat_history, user_input):
    # Newer llama_index versions query through a query engine instead of index.query().
    bot_response = index.as_query_engine(response_mode="compact").query(user_input)
    print("Q:", user_input)
    response = ""
    # Yield the answer one character at a time for a typing effect in the chat window.
    for letter in bot_response.response:
        response += letter
        yield chat_history + [(user_input, response)]
    print("A:", response)

with gr.Blocks() as demo:
    gr.Markdown('AI chat(β 0.1)')
    load_index()
    with gr.Tab("chatbot"):
        chatbot = gr.Chatbot()
        message = gr.Textbox()
        message.submit(chat, [chatbot, message], chatbot)

demo.queue(max_size=100).launch(share=True)
```
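For completeness, this is roughly how I build index.json in a separate step before running the app (a sketch following the same tutorial; the docs folder name is just my local layout):

```
from llama_index import SimpleDirectoryReader, GPTVectorStoreIndex

# Read the CSV (and anything else) from the docs folder into Documents.
documents = SimpleDirectoryReader('docs').load_data()

# Build the vector index and persist it to the directory that load_index() reads.
index = GPTVectorStoreIndex.from_documents(documents)
index.storage_context.persist(persist_dir='index.json')
```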