Hello
I’m embarking on a project that has an odd use-case and would love to hear any feedback or guidance on it or the recommendation system questions at the end.
Last year I quit a 20-year, overpaid career in HR & Recruitment leadership, spent a year at LSE doing Data Analytics, and locked myself away from friends, hobbies, etc. to spend 16+ hours a day running endless models.
I started coding primitive AI in C++ aged 14 and did software engineering, so I had some coding background.
My aim was an entry-level/internship role in AI engineering, but all I had for my portfolio was endless machine learning and database examples, plus an Edge/Personal AI project using customised hardware for a 'Smart Alexa'.
I ran out of time/money before completing the whole course, but I also managed a few NVIDIA and Azure AI modules. Now I need to show I am genuinely interested (obsessed) in an AI career.
The Project
Inferencing
I have begun fine-tuning a Llama 3.1 8B model to sit on my portfolio site as a chatbot with a difference. I'm training it on a hoard of details about myself (my values, ambitions, etc.) alongside more general topics from a list of '400 facts about me', plus a few hundred pages of notes and tutorials on machine learning I've written.
RAG
With my recruiter's hat on, I realised that a RAG hybrid could be useful for providing information on my CV etc. So my chatbot is turning into a RAG / inference hybrid.
Recommendation
After reading an article on recommendation systems, it occurred to me that the process of a recruiter garnering information from an applicant isn't that different from what these systems do.
I was thinking of adding a personalised layer that anticipates interest based on the recruiter's questions. The idea is to adapt item-based CF so that it uses both trained data and RAG to provide more specific responses to those types of questions.
I can draft a large dataset of recruitment-related questions (& answers), as I've written multiple business competency frameworks, designed the assessment material for one of the largest trainee programmes in the UK, and do pro bono work helping students from low-income backgrounds secure trainee roles.
I'd then look to categorise/prioritise RAG or inference responses according to the category of question, perhaps something like:
- Chatbot receives question
- Question embedded & compared for similarity against RAG database
- CF decision: if strong match – provide RAG response; else – provide inferenced (fine-tuned model) response
- Response delivered.
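The routing step above can be sketched in a few lines. This is a minimal toy version assuming precomputed embeddings: the vectors, the 0.75 threshold, and the `route_question` helper are all illustrative placeholders, not anything from a real library.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def route_question(q_vec, rag_vecs, threshold=0.75):
    """Route to RAG on a strong match, else to the fine-tuned model.
    The 0.75 threshold is illustrative, not tuned."""
    best = max(cosine_similarity(q_vec, v) for v in rag_vecs)
    return "rag" if best >= threshold else "fine_tuned"

# Toy vectors standing in for sentence-transformer embeddings.
rag_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(route_question([0.9, 0.1, 0.0], rag_vecs))  # strong match -> "rag"
print(route_question([0.5, 0.5, 0.7], rag_vecs))  # weak match -> "fine_tuned"
```

In practice you'd get `q_vec` and `rag_vecs` from your sentence-transformers model and let FAISS do the similarity search, but the decision logic stays this simple.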
However, this is my first model, so I don’t have the experience and gut instinct to know the best approach.
Option 1: Shooting from the hip, I'd say the simplest approach would be to use cosine similarity against the RAG database, comparing the question against a weighted value.
This feels like it's too simple, but since I'm making it up as I go along, I may use it in the short term whilst I test more complex methods.
Option 2:
Pit the two models against each other: use option 1 for the RAG side, then somehow extract a confidence rating from the inference model and multiply it by the weight.
This feels like a better approach, and I'm guessing there must be a way to extract the confidence from the model; I'm just not sure if it would need a larger training dataset, or if adjusting the 'question' weights during training would suffice.
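For what it's worth, one common proxy for generative-model "confidence" is the average log-probability of the generated tokens (most backends, e.g. llama.cpp or Hugging Face Transformers, can return per-token scores). A sketch of turning that into a 0–1 score and blending it with the RAG similarity; the 0.5 mixing weight and the toy log-prob values are made-up assumptions, not anything from the post:

```python
from math import exp

def sequence_confidence(token_logprobs):
    """Turn per-token log-probabilities into a 0-1 confidence:
    exp(mean log-prob) = geometric-mean token probability."""
    return exp(sum(token_logprobs) / len(token_logprobs))

def combined_score(rag_similarity, token_logprobs, weight=0.5):
    """Blend RAG cosine similarity with model confidence.
    `weight` is an illustrative mixing parameter."""
    return weight * rag_similarity + (1 - weight) * sequence_confidence(token_logprobs)

# Toy log-probs standing in for what the backend returns per generated token.
logprobs = [-0.1, -0.3, -0.2]
print(round(sequence_confidence(logprobs), 3))  # geometric-mean token probability
```

This doesn't need a larger training dataset, since it's computed at inference time from the generation scores rather than learned.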
Option 3:
The only issue with option 2 is what happens if they both return low confidence. Is it possible to 'reroll' the same question in a different way to extract new confidence values, or would I just use Llama as the backup?
Thank you so much for all the great material you guys post, I’ve got the O’Reilly book lined up also!
Method & Tech:
Locally hosted, self-built website with everything local in NVIDIA-enabled WSL to minimise overheads, on a LAN connection with 70 Mbps upload. Running on an ASUS RTX 4090 STRIX + 128 GB DRAM.
FAISS for RAG, ONNX for inference speed, Sentence-Transformers, scikit-learn, and asyncio.