Unsupervised fine tuning mistral 7b

after unsupervised fine tuning mistral model on custom data, I tried to ask ask question to it from my data to check if it has learnt properly.but in the answers it gave, it starts repeating the same sentence or group of words again and again. eg.
question:what is procedure for Z840 BIOS ROM Flash Update?

answer: Procedure
Shutdown Application Software, and power off the Host PC.
Insert the BIOS CD into the DVD-ROM drive on the Host Computer.
Power on the Host PC.
Press F10 as soon as your display is active and the word Setup appears in the lower right corner of the screen.
note: If you do not press F10 at the appropriate time, you must try again. Restart the Host PC and press F10 again to access the utility, or press Ctrl+Alt+Delete prior to boot if you miss the opportunity to press F10.
Select your language from the list, and press Enter.
Select Main > Replicated Setup, and press Enter.
Select the BIOS version, and press Enter.
The BIOS Settings screen appears.
Select File > Default Setup > Save Current Settings as Default.
The BIOS Settings screen disappears.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
Press Esc to exit the screen manager.
what can be the cause for this?

The result looks absolutely normal for unsupervised fine tuning. The LLM is simply trying to predict the next tokens based on the previous ones. It’s possible that you can stop the endless stream of predictions by adding an end-of-text token at the end of each document for training, but this won’t significantly improve the results.

After unsupervised fine tuning, you need to perform supervised fine tuning on an instruct dataset, which will consist of “question-answer” pairs. This will train your model in how to format responses and when to stop. Right now, you have just trained a “talker” that outputs the most likely next token, but not a “thinking” LLM that can answer questions.

Most likely, during the unsupervised fine tuning, the settings that your base model (Mistral) received during the initial fine tuning by the Mistral team got disrupted.

what is the minimum no. of question-answer pair required for supervised fine tuning an unsupervised fine tuned llm model?

IMHO the estimation can be calculated by this formula:

number of pairs ~= 10 x number of sentences in the original text

You need to prepare the different questions to the individual sentences, group of neighbour sentences, paragraphs, chapters and the text itself in different modalities (like from different roles) and forms.

Plus it would be good idea to teach not only the original text, but the related materials (like summaries, critics, etc) and the historical content (if this is relevant).

This estimation is relevant only if you want that the model “knows” the facts from the original text, If you need to copy the style and logic of the document, probably you can use about the quarter of the number of sentences in the text.

Usually the combination of base model + SFT + RAG gives better results, than pure USFT and/or SFT. At least for facts.

You can look at this series of video, where you can find the answer on your questions. For example in the second part at the 5:17 you can see the result of the based model, which looks very similar to your results:

Hi, can you please explain how you formatted your dataset or link to examples for the base model ? I’ve seen only examples for Instruct