Hello
I’m embarking on a project that has an odd use-case and would love to hear any feedback or guidance on it or the recommendation system questions at the end.
Last year I quit a 20-year, overpaid career in HR & Recruitment leadership, spent a year at LSE doing Data Analytics, and locked myself away from friends, hobbies, etc. to spend 16+ hours a day running endless models.
I started coding primitive AI in C++ aged 14 and did software engineering, so I had some coding background.
My aim was an entry-level/internship role in AI engineering, but all I had for my portfolio was endless machine learning and database examples, plus an Edge/Personal AI project using customised hardware for a 'Smart Alexa'.
I ran out of time/money before completing the whole course, but I also managed a few NVIDIA and Azure AI modules. Now I need to show I am genuinely interested (obsessed) in an AI career.
The Project
Inferencing
I have begun fine-tuning a Llama 3.1 8B model to sit on my portfolio site as a chatbot with a difference. I'm training it on a hoard of details about myself (my values, ambitions, etc.) alongside more general topics from a list of '400 facts about me', plus a few hundred pages of notes and tutorials on machine learning I've written.
RAG
With my recruiter's hat on, I realised that a RAG hybrid could be useful for providing information on my CV etc. So my chatbot is turning into a RAG / inference hybrid.
Recommendation
After reading an article on recommendation systems, it occurred to me that the process of a recruiter garnering information from an applicant isn't that different from what these systems do.
I was thinking of adding a personalised layer that anticipates interest based on the recruiter's questions. The idea is to adapt item-based CF so that it uses both trained data and RAG to provide more specific responses to those types of questions.
I can draft a large dataset of recruitment-related questions (& answers), as I've written multiple business competency frameworks, designed the assessment material for one of the largest trainee programmes in the UK, and do pro bono work helping students from low-income backgrounds secure trainee roles.
I'd then look to categorise/prioritise RAG or inference responses according to the category of question, perhaps something like:
- Chatbot receives question
- Question embedded & compared for similarity against RAG database
- CF decision: if strong match – provide RAG response; else – provide inferenced (fine-tuned model) response
- Response delivered.
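The routing step above can be sketched in a few lines. This is a minimal toy version assuming precomputed embeddings: the vectors, the 0.75 threshold, and the `route_question` helper are all illustrative placeholders, not anything from a real library.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def route_question(q_vec, rag_vecs, threshold=0.75):
    """Route to RAG on a strong match, else to the fine-tuned model.
    The 0.75 threshold is illustrative, not tuned."""
    best = max(cosine_similarity(q_vec, v) for v in rag_vecs)
    return "rag" if best >= threshold else "fine_tuned"

# Toy vectors standing in for sentence-transformer embeddings.
rag_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(route_question([0.9, 0.1, 0.0], rag_vecs))  # strong match -> "rag"
print(route_question([0.5, 0.5, 0.7], rag_vecs))  # weak match -> "fine_tuned"
```

In practice you'd get `q_vec` and `rag_vecs` from your sentence-transformers model and let FAISS do the similarity search, but the decision logic stays this simple.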
However, this is my first model, so I don’t have the experience and gut instinct to know the best approach.
Option 1: Shooting from the hip, I'd say the simplest approach would be to use cosine similarity against the RAG database, comparing the question against a weighted value.
This feels like it's too simple, but since I'm making it up as I go along, I may use it in the short term whilst I test more complex methods.
Option 2:
Pit the two models against each other: use option 1 for the RAG side, then somehow extract a confidence rating from the inference model and multiply it by the weight.
This feels like a better approach, and I'm guessing there must be a way to extract the confidence from the model; I'm just not sure if it would need a larger training dataset, or if adjusting the 'question' weights during training would suffice.
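For what it's worth, one common proxy for generative-model "confidence" is the average log-probability of the generated tokens (most backends, e.g. llama.cpp or Hugging Face Transformers, can return per-token scores). A sketch of turning that into a 0–1 score and blending it with the RAG similarity; the 0.5 mixing weight and the toy log-prob values are made-up assumptions, not anything from the post:

```python
from math import exp

def sequence_confidence(token_logprobs):
    """Turn per-token log-probabilities into a 0-1 confidence:
    exp(mean log-prob) = geometric-mean token probability."""
    return exp(sum(token_logprobs) / len(token_logprobs))

def combined_score(rag_similarity, token_logprobs, weight=0.5):
    """Blend RAG cosine similarity with model confidence.
    `weight` is an illustrative mixing parameter."""
    return weight * rag_similarity + (1 - weight) * sequence_confidence(token_logprobs)

# Toy log-probs standing in for what the backend returns per generated token.
logprobs = [-0.1, -0.3, -0.2]
print(round(sequence_confidence(logprobs), 3))  # geometric-mean token probability
```

This doesn't need a larger training dataset, since it's computed at inference time from the generation scores rather than learned.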
Option 3:
The only issue with option 2 is what happens if they both return low confidence. Is it possible to 'reroll' the same question in a different way to extract new confidence values, or would I just use Llama as the backup?
Thank you so much for all the great material you guys post, I’ve got the O’Reilly book lined up also!
Method & Tech:
Locally hosted, self-built website with everything local in NVIDIA-enabled WSL to minimise overheads, on a LAN connection with 70 Mbps upload. Running on an ASUS RTX 4090 STRIX + 128 GB DRAM.
FAISS for RAG, ONNX for inference speed, Sentence-Transformers, scikit-learn, and asyncio.