Integrating Chat into On-Premises Data Software?

My task is to integrate chat functionality into on-premises data documentation and data catalog software. I need to plan a solution.

I have developed a prototype using LangChain and the OpenAI API, but it is not a production-ready solution. I have also experimented with a few approaches from this blog (Blog Posts: Using LLMs with Streamlit), but they are mostly prototypes as well. Specifically, I followed "snowChat: Leveraging OpenAI's GPT for SQL queries" and expanded it a bit.

I’m not sure how to develop a plan for a production-ready solution. I know that for an on-premises deployment, I will likely need to use the Llama 2 model. What should guide me in developing such a solution? There are plenty of sources on the subject online, but the solutions I’ve found are not suitable for production. Where should I start? I feel somewhat stuck.