Llama-2-7b-chat fine-tuning

Hi! I’m interested in fine-tuning the Llama-2 chat model to be able to chat about my local .txt documents. I’m familiar with the format required for inference using the [INST]… formatting, and have been somewhat successful in using the context window to supply the model with domain-specific information, but the max context length of ~4k tokens is too limiting.

My primary question is: how should I format my data to fine-tune the chat model? Do I convert it to a question-and-answer format and build each training string using the same format I use for inference? Any guidance on this topic would be very much appreciated.
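For what it’s worth, one common approach is exactly that: convert the documents into Q&A pairs and render each pair as a full Llama-2 chat-formatted string, including the trailing `</s>`. This is only a sketch of the published Llama-2 prompt format; the helper name and the example Q&A pair are made up for illustration:

```python
# Render one Q&A pair in the Llama-2 chat format for fine-tuning.
# The question/answer strings below are invented for illustration.

def build_training_example(question: str, answer: str, system: str = "") -> str:
    """Build a single-turn Llama-2-style training string.

    Including the trailing </s> is important: it teaches the model to
    emit an end-of-sequence token instead of continuing into stray
    [INST] / <<SYS>> markers at inference time.
    """
    if system:
        prompt = f"<<SYS>>\n{system}\n<</SYS>>\n\n{question}"
    else:
        prompt = question
    return f"<s>[INST] {prompt} [/INST] {answer} </s>"

example = build_training_example(
    "What does section 3 of the handbook cover?",
    "Section 3 covers the on-call escalation procedure.",
    system="You answer questions about the team handbook.",
)
print(example)
```

Depending on your training library, the `<s>`/`</s>` markers may be added by the tokenizer itself (e.g. via `add_special_tokens`), so check that you aren’t duplicating them.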

Hi @tozster - did you ever solve this one for yourself? I have been following various online guides for this, but I still cannot seem to get consistent fine-tuned results. Sometimes the responses are quite reasonable, but often they include a large number of the delimiter tokens ([INST], <<SYS>>, etc.). I have played with both the training format and the inference prompt, but I cannot get consistently decent behaviour.

Did you ever manage to get through this and get a good result for your files?

I have the same issue where the responses include these tokens.


Similar to inference, fine-tuning also requires chat templates.

See my demo notebook on fine-tuning Mistral-7B (I’m using the chat templates when preparing the data for the model). It would be similar for Llama-2-7B, although Llama-2 uses a different chat template than Mistral.
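To make the chat-template idea concrete: in transformers you would normally call `tokenizer.apply_chat_template(messages, tokenize=False)` on a list of role/content dicts and let the tokenizer’s own template do the work. The hand-rolled function below only approximates what the Llama-2 template produces, so you can see the structure; it is a sketch, not the canonical Jinja template that ships with the tokenizer:

```python
# A hand-rolled approximation of the Llama-2 chat template, for
# illustration only. In practice, prefer
# tokenizer.apply_chat_template(messages, tokenize=False).

def render_llama2_chat(messages: list[dict]) -> str:
    """Render a list of {"role", "content"} messages Llama-2 style.

    A system message, if present, is folded into the first user turn
    inside <<SYS>> markers; each user/assistant pair becomes one
    "<s>[INST] ... [/INST] ... </s>" block, and a trailing user turn
    with no reply is left open for the model to complete.
    """
    system = ""
    turns = []
    for msg in messages:
        if msg["role"] == "system":
            system = f"<<SYS>>\n{msg['content']}\n<</SYS>>\n\n"
        else:
            turns.append(msg)

    out = []
    for i in range(0, len(turns), 2):
        user = turns[i]["content"]
        if i == 0:
            user = system + user  # system prompt rides in the first turn
        if i + 1 < len(turns):
            assistant = turns[i + 1]["content"]
            out.append(f"<s>[INST] {user} [/INST] {assistant} </s>")
        else:
            out.append(f"<s>[INST] {user} [/INST]")
    return "".join(out)

messages = [
    {"role": "system", "content": "You answer questions about local documents."},
    {"role": "user", "content": "Summarise report.txt."},
    {"role": "assistant", "content": "The report covers Q3 sales."},
]
print(render_llama2_chat(messages))
```

If you format your fine-tuning data with the same template you use at inference, the stray [INST]/<<SYS>> tokens people report above usually go away, since the model never sees them in unexpected positions.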

Hey, I am curious: has anybody tried fine-tuning Llama-7B on any summarization tasks, or just some regular ranking tasks? I am trying to figure out the standard setup.