Pythia Tuning Question

eggie5 · June 14, 2023, 7:21pm

I have two datasets:

~150 internal help documents (knowledge base)
~3k search query → answer pairs

I want to build a QA system.

I’ve fine-tuned a Pythia base model on the query-answer pairs (using the Dolly v2 code), and it works pretty well as an instruction model.
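For context, this is roughly how each query-answer pair gets turned into a training example. It's only a sketch: the template wording is a Dolly-style prompt written from memory, not copied from the Dolly v2 repo, and `format_pair` and the sample pair are made-up names and data for illustration.

```python
# Hypothetical Dolly-style template; the exact strings are an assumption,
# not taken verbatim from the Dolly v2 code.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{query}\n\n"
    "### Response:\n{answer}\n\n"
    "### End"
)


def format_pair(query: str, answer: str) -> str:
    """Render one search query and its answer as a single training string."""
    return PROMPT_TEMPLATE.format(query=query.strip(), answer=answer.strip())


# Made-up example pair, standing in for one of the ~3k real ones.
pairs = [("how do I reset my password", "Go to Settings > Security and click Reset.")]
examples = [format_pair(q, a) for q, a in pairs]
print(examples[0])
```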

My question is: how should I incorporate the KB articles? They contain a lot of useful internal data and concepts that the model should know about. Should I first continue pretraining the Pythia base model on the KB docs, and then instruction-tune it on the query-answer pairs?
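To make the "continue pretraining on the KB docs" step concrete, here is a minimal sketch of how I imagine preparing the articles: concatenate them with an end-of-text separator and cut the stream into fixed-length blocks for causal LM training. The whitespace split is only a stand-in for the real tokenizer, and `pack_documents`, `block_size`, and the toy docs are hypothetical.

```python
from typing import List

# Assumed separator; I believe Pythia's tokenizer uses this string as its
# end-of-text token, but treat it as an assumption here.
EOS = "<|endoftext|>"


def pack_documents(docs: List[str], block_size: int) -> List[List[str]]:
    """Concatenate docs into one token stream and split it into equal blocks.

    Whitespace splitting stands in for real tokenization. The trailing
    partial block is dropped so every block has exactly block_size tokens.
    """
    stream: List[str] = []
    for doc in docs:
        stream.extend(doc.split())
        stream.append(EOS)  # mark the document boundary
    n_full = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_full)]


# Toy stand-ins for the ~150 KB articles.
kb_docs = ["alpha beta gamma", "delta epsilon"]
blocks = pack_documents(kb_docs, block_size=3)
print(blocks)
```

With only ~150 documents the number of blocks will be small, which is part of why I'm unsure whether this stage is worth doing before the instruction tuning.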
