Pythia Tuning Question

I have two datasets:

  1. ~150 internal help documents (knowledge base)
  2. ~3k search query → answer pairs

I want to build a QA system.

I’ve fine-tuned a pythia* base model on the query-answer pairs (using the Dolly v2 code), and it works pretty well as an instruction model.

My question is: how should I include the KB articles (dataset 1)? They contain a lot of useful internal data and concepts that the model should know about. Should I first continue pretraining the Pythia base model on the KB docs, and then instruction-tune it on the query-answer pairs?
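
To make the first stage concrete, here is a rough sketch of how the KB docs could be prepared for continued pretraining: concatenate the documents into one token stream, separate them with an end-of-text marker, and cut the stream into fixed-length blocks for plain next-token training. Everything here is an assumption for illustration: `pack_documents` and `block_size` are hypothetical names, whitespace splitting stands in for the real Pythia tokenizer, and the `EOS` string is a placeholder for whatever end-of-text token that tokenizer actually uses.

```python
from typing import Iterable, List

# Placeholder document separator (assumption); use the real tokenizer's EOS token.
EOS = "<|endoftext|>"


def pack_documents(docs: Iterable[str], block_size: int) -> List[List[str]]:
    """Concatenate docs into one token stream and cut it into equal blocks.

    Whitespace splitting is a stand-in for real tokenization (assumption).
    A trailing remainder shorter than block_size is dropped.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")

    stream: List[str] = []
    for doc in docs:
        stream.extend(doc.split())
        stream.append(EOS)  # mark the document boundary

    n_full = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_full)]


if __name__ == "__main__":
    kb_docs = ["reset your password from the login page", "vpn setup guide"]
    for block in pack_documents(kb_docs, block_size=4):
        print(block)
```

In this plan, the blocks would feed stage one (causal language modeling on the KB text), and stage two would reuse the existing instruction tuning on the ~3k query-answer pairs, starting from the stage-one checkpoint.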