I am new to generative AI and working on a project that relies on AI for recommendations. The problem is in the risk-assessment domain, where I start with a very large dataset. I have not been able to find a model trained on such a custom dataset.
The project will expose a REST interface; there is no natural-language interaction.
I am now thinking about fine-tuning a model myself. Here are my thoughts:
- Fine-tune an existing model using PEFT/LoRA. I would like to know whether most base models allow this kind of fine-tuning. LoRA does not update the pre-trained model weights. How expensive can PEFT get if the dataset is about 50K records? (A minimal sketch of what I have in mind follows the list.)
- Is there a way to find appropriate models to start with? I couldn't find models that are pre-trained or task-fine-tuned for the risk-assessment industry. (A Hub-search sketch is below.)
- Over time I will collect more data (risk assessments) from customers, and I am wondering whether I can use it for RAG to further enhance inference. (See the RAG sketch below.)
- Refining/retraining every two months. I expect to collect a large amount of customer data every two months. Is it possible to retrain that often with PEFT/LoRA to take the new data into account? (See the retraining sketch below.)
- In summary, my plan is: base model → PEFT/LoRA → RAG at inference. Every two months, refine the model (PEFT/LoRA) with the data collected over the previous two months and use the newly updated model with RAG again.
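
To make the PEFT/LoRA question concrete, here is a minimal sketch of what I have in mind, assuming a Hugging Face base model and the `peft` library. The model name and hyperparameters are placeholders, not recommendations:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

# Placeholder base model; I would swap in whatever base model fits best.
base_model_name = "gpt2"

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name)

# LoRA keeps the pre-trained weights frozen and trains only small
# low-rank adapter matrices injected into the chosen layers.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # adapter rank (placeholder value)
    lora_alpha=16,              # scaling factor (placeholder value)
    lora_dropout=0.05,
    target_modules=["c_attn"],  # layer names depend on the architecture
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # confirms how few weights are trainable
```

From there, a standard `Trainer` loop over the 50K records would update only the adapter weights, which is what keeps PEFT cheap compared to full fine-tuning.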
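
For finding a starting model, the best I have come up with is searching the Hugging Face Hub programmatically; the search term here is just a guess, and this has not surfaced an established risk-assessment model family for me:

```python
from huggingface_hub import HfApi

api = HfApi()
# List the most-downloaded models matching the (guessed) search term.
for m in api.list_models(search="risk assessment", sort="downloads", limit=10):
    print(m.id, m.downloads)
```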
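
For the RAG part, the rough idea is to embed the newly collected customer records and retrieve the nearest ones at inference time. A sketch assuming `sentence-transformers` and FAISS (the embedding model and records are placeholders):

```python
import faiss
from sentence_transformers import SentenceTransformer

# Placeholder embedding model and records.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
records = ["risk record 1 ...", "risk record 2 ..."]  # collected customer data

# Build a vector index over the collected records.
embeddings = embedder.encode(records, convert_to_numpy=True)
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k stored records most similar to the query."""
    q = embedder.encode([query], convert_to_numpy=True)
    _, idx = index.search(q, k)
    return [records[i] for i in idx[0]]

# At inference time the retrieved records would be prepended to the
# model input as context, alongside the REST request payload.
context = retrieve("new risk assessment request")
```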
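
And for the two-month refresh cycle, I imagine reloading the saved adapter and continuing training on only the newly collected data, roughly like this (the paths are hypothetical):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base_model_name = "gpt2"            # same placeholder base as above
adapter_path = "adapters/risk-v1"   # hypothetical path to last cycle's adapter

# Reload the frozen base model, then attach the previous adapter
# in trainable mode so training can continue from where it left off.
model = AutoModelForCausalLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(model, adapter_path, is_trainable=True)

# ...run the usual training loop here, on only the data collected
# during the last two months...

model.save_pretrained("adapters/risk-v2")  # adapter for the next cycle
```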
Thanks