Questions about the "Question Answering model using open-source LLM" notebook


I would like to learn how to address the questions below; can someone please help me?

  1. Currently, I use my data (20 files) to create embeddings with HuggingFaceEmbeddings. If I had 2 million files, would I still follow the same steps: 1. create embeddings with HuggingFaceEmbeddings, 2. run a similarity search, and 3. pass the results to the model?
  2. At what stage would I need to retrain the LLM?
  3. Is it possible to retrain the LLM with my own data?
  4. Currently, your notebook shows ChromaDB as the vector database. If I want to move to production, how do I host it? Where do I store all my data (embeddings)? Do I need to store all the embeddings in a database, and if so, could you recommend one?
  5. How do I evaluate the Dolly LLM with my data?
  6. Currently, I notice that the Dolly model gives one wrong answer with my data. How do I correct the model? With another model type, such as text classification, I would correct the label and retrain the model with the corrected label. How do I do that here?
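For context, the three steps in question 1 can be sketched as follows. This is a minimal, self-contained illustration: it uses a toy bag-of-words "embedding" in place of HuggingFaceEmbeddings and a plain Python list in place of ChromaDB, purely to show the embed → similarity search → prompt flow; none of the names here come from the actual notebook.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words vector; a stand-in for a real embedding model.
    return dict(Counter(text.lower().split()))

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(v * b.get(w, 0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Step 1: embed every document once, at indexing time.
docs = [
    "dolly is an instruction tuned model",
    "chromadb stores embeddings for similarity search",
]
index = [(d, embed(d)) for d in docs]

# Step 2: a similarity search retrieves the closest document to the query.
query = "where are embeddings stored"
qv = embed(query)
best_doc, _ = max(index, key=lambda item: cosine(qv, item[1]))

# Step 3: the retrieved context plus the query form the prompt for the LLM.
prompt = f"Context: {best_doc}\nQuestion: {query}\nAnswer:"
print(best_doc)
```

The same three steps apply at 2 million files; what changes at that scale is that the index in step 2 is typically an approximate-nearest-neighbor store (e.g. a hosted vector database) rather than an in-memory list.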