Model for Postgres

bajrangbali · October 10, 2023, 8:35am

I have a Postgres database with 200+ tables. Each table contains information about my supply inventory. Some columns are JSON, including nested JSON. There are relationships between tables as well, and some of them are based on values inside a JSON column in one table that point to a column in another table.

I used to run ETL to flatten the data, then query it and generate reports.

Recently I tried to solve this problem with an LLM. I installed and tried the Llama 2 model on one of my EC2 machines, but I was not able to get the results I expected. I wanted to chat with my database.

I have the following questions that I need answered to complete this work. It would be great if you could guide me on next steps and share any other relevant information.

One of the major problems I face is a token limit error.
- I understand that every LLM has a token limit, but I still don't understand whether I need to explicitly create embeddings for my data, normalize or denormalize the data, or do something else.
My existing solution works, but not consistently.
- How can I make my solution more consistent?
My existing solution sometimes gives correct results and sometimes gives incorrect ones, or fails to understand the data.
- How can I make the model understand my data, including complex JSON and other columns, joins, etc.?
Increasing data size
- How can I make my model or solution cope with the data size growing in the near future?
Is there something to do with the model?
- Is there another model I should try that is better suited to my use case?

These are some of the high-level questions I have. It would be great if someone could help me understand the details.

bajrangbali · October 11, 2023, 11:44am

Appreciate help here.

MattiLinnanvuori · October 12, 2023, 7:00pm

You could create embeddings from your data and store them in a database. Then, for each query, you could fetch the text closest to the query in the embedding space and pass it to the LLM as part of the prompt. Each embedding should be created from a passage of text that fits within the LLM's token limit. https://research.ibm.com/blog/retrieval-augmented-generation-RAG
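
A minimal sketch of that retrieval flow, in case it helps. It assumes a toy bag-of-words `embed` function as a stand-in for a real embedding model, an in-memory list instead of a database, and word counts as a rough proxy for tokens. The table descriptions (`warehouse`, `stock`, `supplier`) are hypothetical examples, not your schema.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def chunk(text, max_words=50):
    """Split text into passages of at most max_words words (a rough proxy for tokens)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def build_index(passages, max_words=50):
    """Embed every chunk once and keep (text, vector) pairs in memory."""
    index = []
    for passage in passages:
        for piece in chunk(passage, max_words):
            index.append((piece, embed(piece)))
    return index

def retrieve(index, question, k=2):
    """Return the k stored chunks closest to the question in embedding space."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(index, question, k=2):
    """Put only the retrieved chunks into the prompt, so it stays within the token limit."""
    context = "\n".join(retrieve(index, question, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Hypothetical table descriptions, used only for illustration.
passages = [
    "Table warehouse: columns id, name, city. One row per warehouse.",
    "Table stock: columns item_id, warehouse_id, quantity. warehouse_id references warehouse.id.",
    "Table supplier: columns id, name, details (JSON with contact and address fields).",
]
index = build_index(passages)
top = retrieve(index, "What quantity of each item is in stock?", k=1)
print(build_prompt(index, "What quantity of each item is in stock?", k=1))
```

In a real setup you would swap `embed` for an actual embedding model and keep the vectors in a vector store, but the shape stays the same: chunk, embed, store, then retrieve only the closest chunks for each question.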

bajrangbali · October 13, 2023, 1:47pm

Thanks for the response.
