Best Model for sorting a dataset based on a user question

Hi,
I have a dataset of around 100 tours with fields like start location, end location, title, and countries visited.

I’m trying to get a model to sort the dataset based on a user question like “find me a tour which starts in Paris and goes to Italy”.

Current results are OK, but the model I’m using doesn’t output the whole dataset and the ranking is not the best.

I’m looking for suggestions on models to try. So far I have tried Llama 3.2 and qwen2.5-coder.

I also need the output in JSON form.

Any suggestions would be great.

Thanks


I’m not saying this is the best solution for your situation, but your example reminded me of this approach.
LLMs are not good at accurately handling large amounts of structured data inside the prompt, so I think it would be ideal to implement some kind of tool (function) that the LLM calls, using a framework such as smolagents, LangChain, or Ollama.
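
As a rough illustration of the tool idea, here is a minimal sketch in plain Python. It assumes the LLM only extracts the arguments (start location, countries) from the question, and ordinary code does the scoring over every tour. The function name `search_tours`, the field names, and the sample records are all hypothetical, and the framework-specific wiring for registering the tool is left out.

```python
import json

# Hypothetical sample records; field names are assumed from the fields
# mentioned in the question (start location, end location, title, countries).
TOURS = [
    {"title": "Paris to Rome Explorer", "start_location": "Paris",
     "end_location": "Rome", "countries_visited": ["France", "Switzerland", "Italy"]},
    {"title": "Iberian Highlights", "start_location": "Madrid",
     "end_location": "Lisbon", "countries_visited": ["Spain", "Portugal"]},
    {"title": "French Riviera", "start_location": "Paris",
     "end_location": "Nice", "countries_visited": ["France"]},
]


def search_tours(tours, start_location=None, countries=None):
    """Tool an LLM could call: score every tour and return all of them, ranked, as JSON."""
    wanted = {c.lower() for c in (countries or [])}
    ranked = []
    for tour in tours:
        score = 0
        # One point if the start location matches (case-insensitive).
        if start_location and tour["start_location"].lower() == start_location.lower():
            score += 1
        # One point for each requested country the tour visits.
        visited = {c.lower() for c in tour["countries_visited"]}
        score += len(wanted & visited)
        ranked.append({"title": tour["title"], "score": score})
    # Python's sort is stable, so tied tours keep their original order.
    ranked.sort(key=lambda r: r["score"], reverse=True)
    return json.dumps(ranked)


if __name__ == "__main__":
    # Arguments the LLM might extract from
    # "find me a tour which starts in Paris and goes to Italy".
    print(search_tours(TOURS, start_location="Paris", countries=["Italy"]))
```

Because the code loops over every record, the output always covers the whole dataset and is always valid JSON, which are the two things that are hard to guarantee when the model is asked to rewrite the list itself.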


Thanks for your post.
I currently pre-process the tour data (our current database is around 12,000 tours) by running key fields through an embedding model to create vectors, which are stored in the database. At query time I generate an embedding of the user input and search against those vectors. This works really well and produces better results than the free-text search method I was using.
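
For readers unfamiliar with this setup, the retrieval step can be sketched roughly as below. This is only an illustration: the three-dimensional vectors are hand-written stand-ins for real embeddings, the tour ids are made up, and the actual embedding model and database are not named in this thread.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, tour_vecs, k=2):
    """Return the k (tour_id, similarity) pairs closest to the query vector."""
    scored = [(tour_id, cosine_similarity(query_vec, vec))
              for tour_id, vec in tour_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Stand-in embeddings; a real system would get these from an embedding model.
TOUR_VECTORS = {
    "tour_a": [1.0, 0.0, 0.0],
    "tour_b": [0.0, 1.0, 0.0],
    "tour_c": [0.7, 0.7, 0.0],
}

if __name__ == "__main__":
    query = [1.0, 0.1, 0.0]  # stand-in for the embedded user question
    for tour_id, score in top_k(query, TOUR_VECTORS):
        print(tour_id, round(score, 3))
```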

The problem is that the vector search results don’t take all the properties of the tour data into account, and because it is a similarity search, a query ends up matching many loosely related vectors. I need a process to re-sort the results based on the full user input. I was hoping an AI model would be able to understand the question and sort the result set correctly.
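
One possible shape for that re-sort step, as a sketch only: assume some earlier step (for example an LLM asked to return JSON) has already turned the question into structured constraints, then combine field matches with the vector score. The names `rerank`, `constraints`, `vector_score`, and `field_weight` are hypothetical, and the weighting is arbitrary.

```python
import json


def rerank(candidates, constraints, field_weight=1.0):
    """Re-sort vector search hits by vector score plus a bonus per matching field."""
    reranked = []
    start = constraints.get("start_location")
    wanted = [c.lower() for c in constraints.get("countries", [])]
    for cand in candidates:
        matches = 0
        # Exact (case-insensitive) match on the start location.
        if start and cand["start_location"].lower() == start.lower():
            matches += 1
        # One match per requested country that the tour visits.
        visited = {c.lower() for c in cand["countries_visited"]}
        matches += sum(1 for c in wanted if c in visited)
        combined = cand["vector_score"] + field_weight * matches
        reranked.append({"id": cand["id"], "matches": matches,
                         "score": round(combined, 3)})
    reranked.sort(key=lambda r: r["score"], reverse=True)
    return json.dumps(reranked, indent=2)


if __name__ == "__main__":
    # Hypothetical vector search hits. Tour 1 runs in the wrong direction
    # but has the highest similarity score.
    hits = [
        {"id": 1, "start_location": "Rome", "countries_visited": ["Italy", "France"], "vector_score": 0.91},
        {"id": 2, "start_location": "Paris", "countries_visited": ["France", "Italy"], "vector_score": 0.84},
        {"id": 3, "start_location": "Paris", "countries_visited": ["France"], "vector_score": 0.80},
    ]
    # Constraints as they might be extracted from
    # "find me a tour which starts in Paris and goes to Italy".
    print(rerank(hits, {"start_location": "Paris", "countries": ["Italy"]}))
```

In this sample the Rome to Paris tour has the best vector score, but the tour that actually starts in Paris and visits Italy ends up first once the field matches are added. Keeping the sorting in code also means the model only has to produce a small constraints object rather than the full ranked list.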
