"Masking" the prompt / repeated portions

Hi, I’m fine-tuning Mistral 7B (QLoRA) for a very specialized task that has the following structure:

  1. a paragraph-long prompt
  2. a two-paragraph assistant response
  3. another one-sentence prompt
  4. a structured JSON assistant response

The prompt portions are always exactly the same across training rows.

The training is converging very quickly (500-600 steps with no batching, out of 5,000 training rows total), and I’m worried that’s because it’s overfitting on the fixed prompt portions.

My question is: should I

  1. write a custom loss function to either ignore or downweight the prompt tokens (a rough sketch of what I mean is below the list)?
  2. do away with prompting altogether and hope the model will learn the task organically?
  3. something else?
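
For option 1, here’s a minimal sketch of what I have in mind, assuming a Hugging Face Transformers setup: concatenate the four segments, but set the labels on the fixed prompt tokens to -100 so the loss only counts the assistant responses. (The checkpoint name, segment names, and the `build_example` helper are just illustrative.)

```python
# Rough sketch: mask the fixed prompt portions so the loss is computed
# only on the assistant responses. -100 is the ignore_index used by the
# causal-LM cross-entropy loss in transformers, so those positions add no loss.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

def build_example(prompt_1, response_1, prompt_2, response_2):
    segments = [
        (prompt_1, False),    # fixed paragraph-long prompt -> masked
        (response_1, True),   # two-paragraph assistant response -> trained on
        (prompt_2, False),    # fixed one-sentence prompt -> masked
        (response_2, True),   # structured JSON response -> trained on
    ]
    input_ids, labels = [], []
    for text, train_on in segments:
        ids = tokenizer(text, add_special_tokens=False)["input_ids"]
        input_ids += ids
        labels += ids if train_on else [-100] * len(ids)
    return {"input_ids": input_ids, "labels": labels}
```

(I think this is also roughly what completion-only collators like TRL’s `DataCollatorForCompletionOnlyLM` do, if I’d rather not hand-roll it.)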

Thanks!!