The motivation is obviously digital immortality. I realize that meat-me will still die. This Virtual Intelligence (VI) is for the benefit of my friends and family. Not sure if it needs to be entirely from scratch. After all, meat-me has consumed billions of lines of random content on the internet, just like most LLMs. I plan on using Mistral as a starting point, but I want to go a little further than standard fine-tuning. Most importantly, I'd like to use a system similar to what was allegedly used to create ChatGPT:
As you can see, in the second phase the model generates several outputs for each training step. Then a human labeler ranks the outputs from best to worst, and that ranking data is used to train the reward model. I feel like I'm taking crazy pills because I can't find any tutorial or example of someone else who has even tried incorporating human feedback into a transformer model. Is this an insurmountably difficult task for everyone except OpenAI engineers? Sure, it'll take a while, but this project doesn't have a serious time limit. Ideally my digital clone maker would be a chatbot that outputs multiple results every time a string of human text input is submitted, and then I'd assign a 1 to 5 rating to each output. This is the main thing I want to do. None of the following ideas are really necessary if this human-ranked reward model works out. I just want to see what people think of them.
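For what it's worth, this pipeline has a name (RLHF, reinforcement learning from human feedback) and open implementations exist, e.g. Hugging Face's TRL library. The core of the reward-model phase is just a pairwise ranking loss over the human's ordering. Here's a minimal numpy sketch of that loss; the scores are made up, and a real reward model would be a transformer head producing them, not a standalone function:

```python
import numpy as np

def pairwise_ranking_loss(rewards_ranked):
    """Bradley-Terry style loss over one human ranking.

    rewards_ranked: reward-model scores for one prompt's outputs,
    ordered best-to-worst by the human labeler. For every pair
    (better, worse) the loss is -log(sigmoid(r_better - r_worse)),
    which pushes the model to score preferred outputs higher.
    """
    rewards = np.asarray(rewards_ranked, dtype=float)
    total, pairs = 0.0, 0
    for i in range(len(rewards)):
        for j in range(i + 1, len(rewards)):
            diff = rewards[i] - rewards[j]
            total += -np.log(1.0 / (1.0 + np.exp(-diff)))
            pairs += 1
    return total / pairs

# Scores that agree with the human ranking give a low loss...
print(pairwise_ranking_loss([3.0, 1.0, -2.0]))
# ...while scores that contradict it give a high one.
print(pairwise_ranking_loss([-2.0, 1.0, 3.0]))
```

Your 1-to-5 ratings would work too: any pair of outputs with different ratings gives you a (better, worse) training pair for exactly this loss.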
Before the outputs are even generated, the model should classify the topic of the human input: jokes, video games, democracy, linguistics, etc., i.e. the topics meat-me is actually interested in. Does this make any sense to do? Would it increase the accuracy of the results? My internal thought process when it comes to writing feels like I take in the topic, context, and prompt, then I output text. I'm one of those people who doesn't really have an internal monologue. It sure doesn't feel like I'm carefully reasoning my way through every IRL conversation; it feels like I'm saying words to achieve some objective. Maybe, in addition to topics, objectives would need to be generated based on the input: give opinions, offer advice, share experience, explain reasoning, win argument, critically analyze, actively listen, etc. I got this idea from trying to make a game with deep-talk mechanics.
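To make the classification idea concrete, here's a toy sketch of tagging each input with a topic and an objective, then prepending the tags so the generator conditions on them. The keyword lists and tag format are placeholders; a real version would use a small classifier (or ask the LLM itself to label the input):

```python
# Toy sketch: tag the input with a topic and an objective, then
# prepend the tags so the generator can condition on them.
# All keywords and labels below are illustrative placeholders.

TOPIC_KEYWORDS = {
    "video games": ["game", "boss", "quest", "fps"],
    "linguistics": ["language", "grammar", "word", "dialect"],
    "democracy": ["vote", "election", "policy", "government"],
}

OBJECTIVE_KEYWORDS = {
    "offer advice": ["should i", "how do i", "any tips"],
    "win argument": ["you're wrong", "disagree", "actually"],
    "actively listen": ["i feel", "today i", "happened to me"],
}

def tag_input(text, keyword_map, default):
    lowered = text.lower()
    for label, keywords in keyword_map.items():
        if any(k in lowered for k in keywords):
            return label
    return default

def build_prompt(user_text):
    topic = tag_input(user_text, TOPIC_KEYWORDS, "general")
    objective = tag_input(user_text, OBJECTIVE_KEYWORDS, "give opinions")
    return f"[topic: {topic}] [objective: {objective}] {user_text}"

print(build_prompt("Any tips for beating this boss?"))
# → [topic: video games] [objective: offer advice] Any tips for beating this boss?
```

Whether conditioning on these tags actually improves accuracy is an empirical question, but it gives you a cheap knob to experiment with.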
It can't only respond to prompts and still feel like me, but I don't think a transformer alone would help with that. Has anyone incorporated a timer function or additional variables on top of an LLM? Right now I'm thinking it'd need some fluctuating stats to replicate things like my social battery, level of interest, or affinity for the person who is interacting with me. For instance, if my VI didn't talk or receive input for a day, he would have a chance of making an unprovoked output. This VI would kind of be a combination of an LLM and a video game NPC.
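The NPC-style layer doesn't need to touch the transformer at all; it can be a plain state object that tracks the stats and decides when to trigger generation. A sketch, with all stat names, decay rates, and probabilities invented for illustration:

```python
import random
import time

class VIState:
    """Toy NPC-style state layered on top of an LLM.

    Sketch only: the stats, decay rates, and probability formula
    are made-up placeholders to show the mechanism.
    """

    def __init__(self, seed=None):
        self.social_battery = 1.0   # drains as the conversation goes on
        self.interest = 0.5         # rises on engaging input, falls otherwise
        self.last_input_time = time.time()
        self.rng = random.Random(seed)

    def on_input(self, engaging: bool):
        """Update stats whenever a human message arrives."""
        self.last_input_time = time.time()
        self.social_battery = max(0.0, self.social_battery - 0.1)
        self.interest = min(1.0, self.interest + (0.2 if engaging else -0.1))

    def should_speak_unprompted(self, now=None) -> bool:
        """After a day of silence, roll a chance of an unprovoked message."""
        now = time.time() if now is None else now
        idle_hours = (now - self.last_input_time) / 3600
        if idle_hours < 24:
            return False
        # Higher interest and a recharged battery make it more likely.
        p = min(1.0, 0.3 * self.interest + 0.3 * self.social_battery)
        return self.rng.random() < p
```

A background loop (or cron job) would poll `should_speak_unprompted()` and, when it fires, call the LLM with something like a "start a conversation about a recent topic" prompt.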
That feature would need a permanent memory stored in the weight matrices themselves, so it's retained after the model is shut off and turned back on. Remember, this software wouldn't only be for making inferences, so it'd also need the ability to have its weights updated on the fly even when it's not being trained by me. How would these memories of conversations be stored differently than the training of the reward model? Has anyone had experience with using multiple modules connected in irregular ways, à la actual brains, rather than old-fashioned giant matrices?
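One thing worth knowing: weights aren't the only place memories can live. The more common approach is to keep conversation memory outside the model entirely, in a persistent store that gets retrieved back into the prompt, which survives shutdowns without any on-the-fly weight updates. A toy sketch, using naive word overlap as a stand-in for a real embedding index (class and method names are mine):

```python
import json
from pathlib import Path

class MemoryStore:
    """Toy external memory: conversations live in a list that can be
    persisted to disk, instead of inside the model's weights.
    Retrieval is naive word overlap, standing in for a real
    embedding-based similarity search.
    """

    def __init__(self):
        self.memories = []

    def add(self, text):
        self.memories.append(text)

    def recall(self, query, k=3):
        """Return the k stored memories sharing the most words with the query."""
        q = set(query.lower().split())
        return sorted(self.memories,
                      key=lambda m: len(q & set(m.lower().split())),
                      reverse=True)[:k]

    def save(self, path):
        Path(path).write_text(json.dumps(self.memories))

    def load(self, path):
        self.memories = json.loads(Path(path).read_text())
```

The recalled memories get prepended to the prompt before generation, so the model "remembers" without being retrained; this is cleanly separate from the reward-model training, which only ever sees ranked outputs.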
Ideally this would be a multimodal model that can take in sound and images then produce a reaction, but I’m not worrying about that for now.