Q&A chatbot using timeseries data (Need efficient approach)

Hi All,

I'd like your suggestions. I want to build a Q&A chatbot. I have structured time-series data in a PostgreSQL database: minute-level, hourly, and daily data, plus ML results, recommendations, and anomalies.

I want to understand the best way to implement the Q&A chatbot. Should I use:

A RAG approach that converts all of this data to vectors? APIs that aggregate the data and pass it directly to an LLM? Or is there a more efficient approach?

Thank you for the help!

A hybrid approach that combines RAG with SQL would be the best fit, if that is feasible for you.

Since you already have structured data, and numeric data in particular, vectorizing all of it would be a waste. The embedding models used for RAG retrieval are designed mainly for text, and they do not handle raw numbers accurately when those are embedded as-is.
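
To make the SQL side concrete, here is a minimal sketch of the "aggregate first, then hand the result to the LLM" idea. Everything in it is an assumption for illustration: the `readings` table, its columns, the sample rows, and the helper names `build_context` and `build_prompt` are made up, and it uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs without any setup. The actual LLM call is left out; the sketch only shows how the numbers reach the prompt as exact values rather than as embeddings.

```python
import sqlite3


def build_context(conn, metric, day):
    """Aggregate one day of readings in SQL and return a compact text summary."""
    count, lo, hi, avg = conn.execute(
        "SELECT COUNT(*), MIN(value), MAX(value), AVG(value) "
        "FROM readings WHERE metric = ? AND date(ts) = ?",
        (metric, day),
    ).fetchone()
    if count == 0:
        return f"No {metric} readings found for {day}."
    return f"{metric} on {day}: n={count}, min={lo}, max={hi}, avg={avg:.2f}"


def build_prompt(question, context):
    """Wrap the aggregated numbers and the user question into one LLM prompt."""
    return (
        "Answer using only the data below.\n\n"
        f"Data:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical schema and sample rows (SQLite stands in for PostgreSQL here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ts TEXT, metric TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("2024-01-01 00:00:00", "temperature", 20.0),
        ("2024-01-01 00:01:00", "temperature", 22.0),
        ("2024-01-01 00:02:00", "temperature", 27.0),
        ("2024-01-02 00:00:00", "temperature", 18.0),
    ],
)

context = build_context(conn, "temperature", "2024-01-01")
prompt = build_prompt("What was the peak temperature on 1 January?", context)
print(prompt)
```

In a real setup the metric and date would come from parsing the user's question (or from an LLM generating the SQL), and RAG would only be needed for the text-heavy parts such as recommendations or anomaly descriptions.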

This Q&A dataset that I created might serve as an example. I used three tags, “question:”, “context:” and “answer:”, along with an anchor-point tag, “|”. This avoids adding noise to the data and gives the bot a pattern it can recognize.
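
As a rough illustration of that kind of layout, here is a small sketch that formats one record with the three tags and the “|” anchor. The exact placement of the separator and the helper name `format_record` are assumptions made for this example, so the real dataset may be laid out differently.

```python
def format_record(question, context, answer):
    """Join the three tagged fields into one line, separated by the '|' anchor."""
    fields = [("question", question), ("context", context), ("answer", answer)]
    return " | ".join(f"{tag}: {text.strip()}" for tag, text in fields)


# Hypothetical sample record, only to show the resulting pattern.
record = format_record(
    "What was the peak temperature?",
    "temperature: min=20.0, max=27.0",
    "The peak temperature was 27.0.",
)
print(record)
```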

I can set up a link to the training script, vocab and dataset if you’d like. It could save you a lot of time.
