I’m a software programmer and I know only some basics of ML and AI, mostly learned at work from my data analyst coworkers <3
I am trying to understand how, from a software architecture point of view, I would implement something like NLP models for data extraction and LLMs (to make answers more appealing, I guess). My intent is not to understand all the math and statistics underneath, but to understand how I would use already existing models and which tools are best for the job.
Since I like a hands-on approach, I thought I would start a small project of my own to help me and my fellow DnD adventurers.
The idea is to have a small application that lets us upload documents such as notes and session summaries, and then later query the database to get answers about characters or events that occurred in the past.
I sketched the architecture of the application and the information flow.
What I would like to understand is:
- Is the usage I have in mind for the models appropriate? Or am I misunderstanding something?
- What are some good existing datasets and models to start experimenting with for this kind of usage? I tried looking around, but there is a lot of information and I am kind of confused about where to start…
- How do I achieve the LLM part? The idea is to feed the model the content of the retrieved docs on the fly and have it formulate an answer to the user query using that information and context.
- What documentation would you recommend on the subject, or what YouTube content do you feel is a good starting point (but not too basic)?
Start smaller. Get your NLP model working first. The web app, backend, etc. can come later if your goal is to learn NLP. These other bits can be a distraction IMO while you are learning NLP.
Usage looks okay. What you call the backend will likely be several different backends: one for the ML processing and another for the non-ML stuff (users, sessions, etc.).
Don’t try to solve your problem immediately; start small and run through the tutorials in order. Alternatively, look at fast.ai, which offers a very different approach to learning by jumping into working examples immediately. They start with image processing and then move to NLP, so you will need some patience. You may see there are multiple ways to frame your solution as an ML problem (question answering, masked language modeling, entity recognition, sentence prediction).
To start with, for the LLM part you can just use an existing model, e.g.

    from transformers import pipeline

    # Loads a default extractive question-answering model on first use
    question_answerer = pipeline("question-answering")
    result = question_answerer(
        question="What is DnD?",
        context="Some text that contains information on what DND is goes here..",
    )
    print(result["answer"])
If you are familiar with Python, put this code behind a FastAPI server and then query it from whatever backend/frontend frameworks you use. Get the “context” from your database; the question can come from the frontend.
- Then, over time, you can replace the LLM with your own, assuming you have enough data to train the model on your DnD domain, by following this approach: Question answering - Hugging Face NLP Course
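For the FastAPI step, a rough sketch could look like the following. Everything specific here is an assumption for illustration only: the SQLite `notes` table with a `body` column, the `/ask` route, and the `tokenize`, `fetch_context` and `create_app` helpers are all hypothetical names. The word-overlap ranking is just a crude stand-in for real text search or embeddings, so treat it as a starting point and not as the way to do retrieval.

```python
import re
import sqlite3
from contextlib import closing


def tokenize(text: str) -> set:
    """Lowercased word set; a crude stand-in for real search or embeddings."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def fetch_context(conn: sqlite3.Connection, question: str, limit: int = 3) -> str:
    """Return the notes that share the most words with the question.

    Assumes a hypothetical table `notes(body TEXT)`.
    """
    q_words = tokenize(question)
    rows = conn.execute("SELECT body FROM notes").fetchall()
    scored = [(len(q_words & tokenize(body)), body) for (body,) in rows]
    # Keep only notes with at least one shared word, best match first
    best = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return "\n".join(body for _, body in best[:limit])


def create_app(db_path: str):
    """Build the web app around the question-answering pipeline.

    The third-party imports are local so that fetch_context can be tried
    without fastapi/transformers installed.
    """
    from fastapi import FastAPI
    from transformers import pipeline

    app = FastAPI()
    question_answerer = pipeline("question-answering")  # loaded once, reused per request

    @app.get("/ask")
    def ask(question: str):
        # Open a connection per request: by default a sqlite3 connection
        # may only be used in the thread that created it.
        with closing(sqlite3.connect(db_path)) as conn:
            context = fetch_context(conn, question)
        if not context:
            return {"answer": None, "score": 0.0}
        result = question_answerer(question=question, context=context)
        return {"answer": result["answer"], "score": float(result["score"])}

    return app
```

To try it, you could put `app = create_app("notes.db")` in a `main.py` next to this code, start it with `uvicorn main:app`, and call `/ask?question=...` from the frontend. The `notes.db` file name is again just a placeholder.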
Thank you for the tips!
The goal is to learn NLP applied to software architecture, that’s why I decided to build a project around it, but for sure I’ll start from the models.
About the backends: yeah, I simplified it in the sketch.
I’ll check out the stuff you linked!