I am trying to train a Q/A chatbot on a very large dataset. I have the raw text itself and another dataset formatted as user/assistant conversations. I want the model to learn the contents of the text and then also answer questions about it.
I know RAG is definitely a good option here, but would fine-tuning twice also work in this case? The first fine-tune would run on the raw text so the model can learn from it directly, and the second on the Q/A conversations so it can act as a chatbot.
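For reference, here's roughly how I'm planning to prepare the two datasets - the chunk size, field names, and chat markers below are just placeholders, not from any specific framework:

```python
# Sketch of data prep for the two fine-tuning stages.
# Chunk size, field names, and chat markers are placeholder choices.

def chunk_raw_text(text, max_chars=2000):
    """Stage 1: split the raw corpus into chunks for continued training on the text."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def format_qa_pair(question, answer):
    """Stage 2: render one Q/A pair in a simple chat template."""
    return (
        "<|user|>\n" + question.strip() + "\n"
        "<|assistant|>\n" + answer.strip()
    )

corpus = "Some very long document ... " * 500
stage1_examples = chunk_raw_text(corpus)

qa_dataset = [{"question": "What is the doc about?", "answer": "A long document."}]
stage2_examples = [format_qa_pair(d["question"], d["answer"]) for d in qa_dataset]
```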
Your fine-tuning approach sounds interesting - you could also try leveraging the T5 models for this, which have been pre-trained on multiple tasks, including QA in the way you described.
But this approach could lead to overfitting: questions that are out of distribution but look similar to the fine-tuning data may get the training-set answers back, leading to wrong outputs, since the model has seen each training answer twice (once in the raw text and once in Q/A form).
RAG would definitely be worth checking out, with a vector database to do similarity search and retrieval - they're built for exactly the purpose you're aiming for, and retrieval could also be faster than regenerating knowledge from weights.
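To make the retrieval idea concrete, here's a toy sketch in plain Python. A real pipeline would embed text with a learned model (e.g. a sentence transformer) and index the vectors in a store like FAISS or Chroma; the bag-of-words counts here just illustrate the similarity-search step:

```python
import math
from collections import Counter

def embed(text):
    # Toy "embedding": word counts instead of a learned vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

documents = [
    "The treaty was signed in 1848 after the war ended.",
    "Photosynthesis converts sunlight into chemical energy.",
    "The chatbot answers questions about the source text.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query, k=1):
    # Rank all indexed chunks by similarity to the query and return the top k.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

print(retrieve("When was the treaty signed?"))
```

The retrieved chunk(s) would then be prepended to the prompt before the model generates its answer, so the model quotes the text instead of relying on memorized weights.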