Yes, the approach you're describing is actually similar to what I did before:
- A user will send a message (it can be in any form and wording)
- An LLM (or any intent/entity extraction model) extracts info from it into a structured form you can process programmatically
- Your business logic uses the extracted info to build a query and fetch the data from your DB
- The LLM uses the query result to build a "natural" response and sends it to the user
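The four steps above can be sketched roughly like this. This is only an illustration, not your actual system: `extract_intent` is a stub standing in for the LLM extraction call (in practice you'd prompt the model to return structured JSON), and the final templated string stands in for the LLM's natural-language response step. The `count_orders` intent and the `orders` table are made-up examples.

```python
import sqlite3

def extract_intent(message: str) -> dict:
    # Stub for step 2: a real system would prompt an LLM to turn
    # "how many orders did Alice place?" into structured JSON like this.
    return {"intent": "count_orders", "entities": {"customer": "Alice"}}

def build_query(extracted: dict) -> tuple:
    # Step 3: business logic maps the extracted intent to a parameterized
    # SQL query (parameterized so model output never lands in raw SQL).
    if extracted["intent"] == "count_orders":
        return ("SELECT COUNT(*) FROM orders WHERE customer = ?",
                (extracted["entities"]["customer"],))
    raise ValueError("unsupported intent: " + extracted["intent"])

def answer(message: str, conn: sqlite3.Connection) -> str:
    extracted = extract_intent(message)          # step 2
    sql, params = build_query(extracted)         # step 3
    (count,) = conn.execute(sql, params).fetchone()
    # Step 4 would feed `count` back to the LLM for a natural reply;
    # a plain template stands in for that here.
    return f"{extracted['entities']['customer']} has {count} orders."

# Toy in-memory DB so the sketch is self-contained and runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, "Alice"), (2, "Alice"), (3, "Bob")])
print(answer("how many orders did Alice place?", conn))  # -> Alice has 2 orders.
```

The main design point is that the LLM only produces structured data and natural-language text; the actual SQL is built by your own code with bound parameters, which keeps the model from injecting anything into the query.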
I understand that your challenge is not having a machine with good enough specs to deploy a capable LLM locally, which introduces extra difficulties (e.g. failing to extract the correct info from the user's message, slow response times, ...). Tbh, I had the same issue when I tried to build a similar chatbot with limited resources.
From my experiments, Gemma performed a bit better than the others (Qwen3, TinyLlama, some PhoBERT models) when deployed on a low-spec machine.